NAME

DjVu - DjVu and DjVuLibre. 

INTRODUCTION 

Although the Internet has given us a worldwide infrastructure on which to build the universal library, much
of the world's knowledge, history, and literature is still trapped on paper in the basements of the world's
traditional libraries. Many libraries and content owners are in the process of digitizing their collections. While
many such efforts involve the painstaking process of converting paper documents to computer-friendly
form, such as SGML-based formats, the high cost of such conversions limits their extent. Scanning documents
and distributing the resulting images electronically is not only considerably cheaper, but also more
faithful to the original document because it preserves its visual aspect.

Despite the quickly improving speed of network connections and computers, the number of scanned document
images accessible on the Web today is relatively small. There are several reasons for this.

The first reason is the relatively high cost of scanning anything other than unbound sheets in black and white.
This problem is slowly going away with the appearance of fast and low-cost color scanners with sheet feeders.

The second reason is that long-established image compression standards and file formats have proved inadequate
for distributing scanned documents at high resolution, particularly color documents. Not only are
the file sizes and download times impractical, the decoding and rendering times are also prohibitive. A typical
magazine page scanned in color at 100 dpi in JPEG would occupy 100 KB to 200 KB, but the
text would be hardly readable: insufficient for screen viewing and totally unacceptable for printing. The
same page at 300 dpi would have sufficient quality for viewing and printing, but the file size would be 300
KB to 1000 KB at best, which is impractical for remote access. Another major problem is that a fully
decoded 300 dpi color image of a letter-size page occupies 24 MB of memory and easily causes disk swapping.
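The 24 MB figure can be checked with simple arithmetic. The following Python sketch assumes a US letter page of 8.5 by 11 inches and 3 bytes per pixel; both values are assumptions, since the text states only the resolution and the total.

```python
def decoded_size_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Memory needed to hold a fully decoded raster image."""
    width_px = round(width_in * dpi)    # 8.5 in * 300 dpi = 2550 pixels
    height_px = round(height_in * dpi)  # 11 in * 300 dpi = 3300 pixels
    return width_px * height_px * bytes_per_pixel


size = decoded_size_bytes(8.5, 11, 300)
print(size, "bytes =", round(size / 2**20, 1), "MiB")
# prints: 25245000 bytes = 24.1 MiB
```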

The third reason is that digital documents are more than just a collection of individual page images. Pages
in a scanned document have a natural serial order. Special provision must be made to ensure that flipping
pages is instantaneous and effortless so as to maintain a good user experience. Even more important, most
existing document formats force users to download the entire document first before displaying a chosen
page. However, users often want to jump to individual pages of the document without waiting for the entire
document to download. Efficient browsing requires efficient random page access, fast sequential page flipping,
and quick rendering. This can be achieved with a combination of advanced compression, pre-fetching,
pre-decoding, caching, and progressive rendering. DjVu decomposes each page into multiple components
(text, backgrounds, images, libraries of common shapes...) that may be shared by several pages and downloaded
on demand. All these requirements call for a very sophisticated but parsimonious control mechanism
to handle on-demand downloading, pre-fetching, decoding, caching, and progressive rendering of the
page images. What is being considered here is not just a document image compression technique, but a
whole platform for document delivery.

DjVu is an image compression technique, a document format, and a software platform for delivering document
images over the Internet that fulfills the above requirements.

DJVU IMAGE COMPRESSION 

The DjVu image compression is based on three technologies: 

DjVuPhoto 

DjVuPhoto, also known as IW44, is a wavelet-based continuous-tone image compression technique with
progressive decoding/rendering. It is best used for encoding photographic images in color or in shades of
gray. Images are typically half the size of JPEG files for the same distortion.

DjVuBitonal 

DjVuBitonal, also known as JB2, is a bitonal image compression that takes advantage of repetitions of
nearly identical shapes on the page (such as characters) to efficiently compress text images. It is best used
to compress black and white images representing text and simple drawings. A typical 300 dpi page in
DjVuBitonal occupies 5 to 25 KB (3 to 8 times better than TIFF-G4 or PDF).

DjVuDocument 

DjVuDocument is a compression technique specifically designed for color digital document images containing
both pictures and text, such as a page of a magazine. DjVuDocument represents images as separately
compressed layers. The foreground layer is usually compressed with DjVuBitonal and contains the
text and drawings. The background layer is usually compressed with DjVuPhoto and contains the background
texture and the pictures at lower resolution.

DJVU DOCUMENT DELIVERY PLATFORM 

The DjVu technology is designed from the ground up to support the efficient delivery of digital documents 
over the Internet. It provides various ways to deal with multi-page documents, and various ways to enrich 
the content with hyper-links, meta-data, searchable text, etc. 

MIME types 

The DjVu format has an official MIME type of image/vnd.djvu, which is the preferred content-type to be
given by HTTP servers for DjVu files. Unofficial MIME types used historically are image/x.djvu and
image/x-djvu, which may still be encountered. Ideally, clients should be configured to handle all three.
(For web server configuration help, see http://www.djvuzone.org/support/tutorial/chapter-authoring1.html.)

Bundled multi-page documents 

A bundled multi-page DjVu document uses a single file to represent the entire document. This single file
contains all the pages as well as ancillary information (e.g. the page directory, data shared by several pages, 
thumbnails, etc.). Using a single file format is very convenient for storing documents or for sending email 
attachments. 

When you type the URL of a multi-page document, the DjVu browser plugin starts downloading the whole 
file, but displays the first page as soon as it is available. You can immediately navigate to other pages using 
the DjVu toolbar. Suppose however that the document is stored on a remote web server. You can easily 
access the first page and see that this is not the document you wanted. Although you will never display the
other pages, the browser is transferring data for these pages and is wasting the bandwidth of your server
(and the bandwidth of the Internet too). You could also see the summary of the document on the first page
and jump to page 100. But page 100 cannot be displayed until data for pages 1 to 99 has been received.
You may have to wait for the transmission of unnecessary page data. This second problem (the unnecessary
wait) can be solved using the "byte serving" option of the HTTP/1.1 protocol. This option has to be
supported by the web server, the proxies, the caches and the browser. Byte serving however does not solve
the first problem (the waste of bandwidth).

Indirect multi-page documents 

Indirect multi-page DjVu documents solve both problems. An indirect multi-page DjVu document is composed
of several files. The main file is named the index file. You can browse a document using the URL of
the index file, just like you do with a bundled multi-page document. The index file however is very small.
It simply contains the document directory and the URLs of secondary files containing the page data. When
you browse an indirect multi-page document, the browser only accesses data for the pages you are viewing.
This can be done at a reasonable speed because the browser maintains a cache of pages and sometimes
pre-fetches a few pages ahead of the current page. This model uses the web serving bandwidth much more
effectively. It also eliminates unnecessary delays when jumping ahead to pages located anywhere in a long
document.

Annotations 

Every DjVu image optionally includes so-called annotation chunks. The annotation chunk is often used to 
define hyper-links to other document pages or to arbitrary web pages. Annotation chunks can also be used 
for other purposes such as setting the initial viewing mode of a page, defining highlighted zones, or storing 
arbitrary meta-data about the page or the document.

Hidden text

Every DjVu image optionally includes a hidden text layer that associates graphical features with the corresponding
text. The hidden text layer is usually generated by running Optical Character Recognition (OCR)
software. This textual information provides for indexing DjVu documents and copying/pasting text from
DjVu page images.

Thumbnails 

DjVu documents sometimes contain pre-computed page thumbnails. 

Outline 

DjVu documents sometimes contain a navigation chunk containing an outline, that is, a hierarchical table of 
contents with pointers to the corresponding document pages. 

DJVUZONE AND DJVULIBRE 

The DjVu technology was initially created by a few researchers in AT&T Labs between 1995 and 1999.
Lizardtech, Inc. (http://www.lizardtech.com) then obtained a commercial license from AT&T and continued
the development. They now have a variety of solutions for producing and distributing documents
using the DjVu technology.

The DjVuZone web site (http://www.djvuzone.org) is managed by the few AT&T Labs researchers who
created the DjVu technology in the first place. We promote the DjVu technology by providing an independent
source of information about DjVu.

Understanding how little room there is for a proprietary document format, Lizardtech released the DjVu
Reference Library under the GNU General Public License in December 2000. This library entirely defines the
compression format and the elementary codecs. Six months later, Lizardtech released an updated DjVu Reference
Library as well as the source code of the Unix viewer.

These two releases form the basis of our initial DjVuLibre software. We modified the build system to comply
with the expectations of the open source community. Various bugs and portability issues have been
fixed. We also tried to make it simpler to use and install, while preserving the essential structure of the
Lizardtech releases.

The DjVuLibre software contains the following components: 

bzz(1)

A general purpose compression command line program. Many internal DjVu data structures are
compressed using this technique.

c44(1)

A DjVuPhoto command line encoder. This state-of-the-art wavelet compressor produces
DjVuPhoto images from PPM or JPEG images.

cjb2(1)

A DjVuBitonal command line encoder. This soft-pattern-matching compressor produces DjVuBitonal
images from PBM images. It can encode images without loss, or introduce small changes
in order to improve the compression ratio. The lossless encoding mode is competitive with that of
the Lizardtech commercial encoders.

cpaldjvu(1)

A DjVuDocument command line encoder for images with few colors. This encoder is well suited
to compressing images with a small number of distinct colors (e.g. screen-shots). The dominant
color is encoded by the background layer. The other colors are encoded by the foreground layer.

csepdjvu(1)

A DjVuDocument command line encoder for separated images. This encoder takes a file containing
pre-segmented foreground and background images and produces a DjVuDocument image.

ddjvu(1)

A command line decoder for DjVu images. This program produces a PNM image representing any
segment of any page of a DjVu document at any resolution.

djview(1)

A stand-alone viewer for DjVu images. This sophisticated viewer displays DjVu documents. It
implements document navigation as well as fast zooming and panning.

nsdejavu(1)

A web browser plugin for viewing DjVu images. This small plugin allows for viewing DjVu documents
from web browsers. It internally uses djview to perform the actual work.

djvups(1)

A command line tool for converting DjVu documents into PostScript.

djvm(1)

A command line tool for manipulating bundled multi-page DjVu documents. This program is
often used to collect individual pages and produce a bundled document.

djvmcvt(1)

A command line tool for converting bundled documents to indirect documents and conversely.

djvused(1)

A powerful command line tool for manipulating multi-page documents, creating or editing annotation
chunks, creating or editing hidden text layers, pre-computing thumbnail images, and more...

djvutxt(1)

A command line tool to extract the hidden text from DjVu documents.

djvudump(1)

A command line tool for inspecting DjVu files and displaying their internal structure.

djvuextract(1)

A command line tool for dis-assembling DjVu image files.

djvumake(1)

A command line tool for assembling DjVu image files.

djvuserve(1)

A CGI program for generating indirect multi-page DjVu documents on the fly.

djvutoxml(1), djvuxmlparser(1)

Command line tools to edit DjVu metadata as XML files.

DJVU ENCODERS AND ANY2DJVU 

DjVuLibre comes with a variety of specialized encoders: c44(1) for photographic images, cjb2(1) for
bitonal images, and cpaldjvu(1) for images with few distinct colors. Although these encoders perform well
in their specialized domain, they cannot handle complex tasks involving segmentation and multipage
encoding.

The Lizardtech commercial products (see http://www.lizardtech.com/solutions/document) can perform
these complex encoding tasks.


Another solution is provided by the compression server at http://any2djvu.djvuzone.org. This machine
uses pre-Lizardtech prototype encoders from AT&T Labs and performs almost as well as the commercial
Lizardtech encoders. Please note that the Any2DjVu compression server comes with no guarantee, that
nothing is done to ensure that your documents will remain confidential, and that there is only one computer
working for the whole planet.

CREDITS 

Numerous people have contributed to the DjVu source code during the last five years. Please submit a
SourceForge bug report to update the following list.

NAME

djview4 - Standalone DjVu viewer 

SYNOPSIS 

djview4 [options] [argument] 

DESCRIPTION 

Standalone viewer for DjVu files. Features include navigating documents, zooming and panning page
images, producing and displaying thumbnails, displaying document outlines, searching documents for particular
words in the hidden text layer, copying hidden text to the clipboard, saving pages and documents as
bundled or indirect multi-page files, and printing pages and documents. The viewer can simultaneously display
several pages using a side-by-side or a continuous layout.

COMMAND LINE ARGUMENT AND OPTIONS 

This program can run as a standalone program or as a slave process for the DjVu browser plugin nsdejavu. 
When running as a standalone program, the command line argument argument can be: 

* The filename of a valid DjVu document. 

* A local DjVu document URL of the form: 

file:///path/name.djvu[?djvuopts&keyword=value&...]

The square brackets delimit the optional components of the URL. Various options can be specified 
using a syntax similar to that of CGI arguments. Specifying options in this manner is very useful for a 
browser plugin because there are no command line arguments. In the case of a standalone viewer, all 
options can be specified as command line arguments. 

* A remote DjVu document URL of the form:

http://host/path/name.djvu[?djvuopts&keyword=value&...]

https://host/path/name.djvu[?djvuopts&keyword=value&...]

Browsing remote DjVu documents with the standalone viewer is less efficient than using the browser 
plugin. The standalone viewer does not benefit from the browser caching strategies and proxy settings. 
Proxy settings for the standalone viewer can be set independently with the preferences dialog. 

An extensive list of options is recognized. Most options can be specified as command line arguments starting
with the customary dash character (-) or using the syntax of CGI arguments in the document URL.
Some options however are only meaningful as command line arguments. Other options are only recognized
when running the X11 version of the djview4 program.

COMMAND LINE OPTIONS 

The following options are only meaningful when specified on the command line. 

-help Display a brief help message. 

-verbose 

Prints informational messages on the console. This option is very useful because it displays messages
about the unrecognized constructs in the DjVu annotation and hyperlink layers.

-fullscreen, -fs 

Start djview4 in full screen mode. Use the key F11 to exit the full screen mode.

-style=stylename

Specify the graphical user interface style. The recognized values for stylename depend on the 
installed version of the Qt toolkit. Common style names include cde, motif, plastique, platinum, 
and windows. 


X11 OPTIONS

The following command line options are recognized by the X11 version of the djview4 program. Unlike
most djview4 options, X11 options that demand an argument do not use the equal character to introduce
their argument.

-display displayname

Specify that the djview4 windows should appear on the X11 display displayname.

-geometry WxH+X+Y

Specify the initial size and position of the first window using the traditional X11 geometry specification
syntax. The numerical arguments W and H represent the initial window width and height.
The numerical arguments X and Y indicate the window position relative to the top left corner of
the screen.

-name name

Set the application name.

-title title

Set the title of the first window.

-fn fontname, -font fontname

Specify the name of the default font used for buttons and menus. The font should be specified
using an X logical font description string.

-bg color, -background color

Specify the default background color for graphical user interface elements. The color should be
given as a standard X11 color name.

-fg color, -foreground color

Specify the default foreground color for graphical user interface elements. The color should be
given as a standard X11 color name.

-btn color

Specify the default button color. The color should be given as a standard X11 color name.

-ncols count

Limit the number of colors allocated on an 8-bit display. The default color cube contains 216 distinct
colors.

-cmap

Force the allocation of a private color map on an 8-bit display. This might increase the color quality
but cause flashing when the viewer window gets activated.

GENERAL OPTIONS 

The following options can be specified as command line options or can be passed by augmenting the document
URL using a syntax similar to that of CGI arguments:

http://.../file.djvu?djvuopts&key=value&key=value&...

In order to separate real CGI arguments from these options, the viewer only recognizes keywords that
appear after the word djvuopts. The keywords key are derived from the option names by removing the initial
dashes.
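As an illustration of this URL syntax, the following Python sketch appends viewer options to a document URL. The function name djvu_url and the percent-encoding of values are assumptions made for this example; only the djvuopts keyword and the dash-stripping rule come from the text above.

```python
from urllib.parse import quote


def djvu_url(base_url, options):
    """Append viewer options to a document URL after the djvuopts keyword."""
    parts = ["djvuopts"]
    for name, value in options.items():
        key = name.lstrip("-")  # keyword = option name without its initial dashes
        if value is None:
            parts.append(key)  # option without an argument, e.g. -passive
        else:
            # Percent-encoding is an assumption; commas are kept readable.
            parts.append(f"{key}={quote(str(value), safe=',')}")
    # Real CGI arguments, if any, stay in front of djvuopts.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + "&".join(parts)


print(djvu_url("http://example.org/doc.djvu", {"-page": 3, "-zoom": "width"}))
# prints: http://example.org/doc.djvu?djvuopts&page=3&zoom=width
```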


-page=pagename

Display a specific document page. The viewer first searches for a page whose identifier matches the
argument pagename. Otherwise, if pagename is a number preceded by character + or -, the
viewer performs a displacement relative to the current page. Otherwise, starting from the current
page and wrapping around, it searches for a page whose title matches the argument pagename.
Otherwise, if pagename is numerical, it is interpreted as an ordinal page number. Otherwise, and
finally, the viewer searches for a page whose name matches pagename.
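The lookup order described above can be sketched in Python as follows. This is not the viewer's code: the Page record, the function name find_page, the exact string comparison, the 1-based ordinal numbering, and the handling of out-of-range values are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    ident: str   # page identifier
    title: str   # page title
    name: str    # page name


def find_page(pages, current, pagename):
    """Return the 0-based index of the page designated by pagename, or None."""
    n = len(pages)
    if n == 0:
        return None
    # 1. A page whose identifier matches.
    for i, page in enumerate(pages):
        if page.ident == pagename:
            return i
    # 2. A signed number is a displacement relative to the current page.
    if re.fullmatch(r"[+-]\d+", pagename):
        return min(max(current + int(pagename), 0), n - 1)  # clamping is an assumption
    # 3. A page whose title matches, starting at the current page and wrapping around.
    for k in range(n):
        i = (current + k) % n
        if pages[i].title == pagename:
            return i
    # 4. A plain number is an ordinal page number (assumed to start at 1).
    if re.fullmatch(r"\d+", pagename):
        index = int(pagename) - 1
        return index if 0 <= index < n else None
    # 5. Finally, a page whose name matches.
    for i, page in enumerate(pages):
        if page.name == pagename:
            return i
    return None
```

With pages titled "i", "ii", "1", "2", the argument "1" selects the third page (title match) rather than the first one, which is the kind of ambiguity that option pageno avoids.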

-pageno=pagenumber

The page searching algorithm for option page can cause ambiguities when page titles can be interpreted
as numbers. The argument of option pageno is always interpreted as an ordinal page number.
This option is less portable than page because it is not recognized by earlier versions of the
DjVu plugin. When using this option is necessary, it is advisable to use both the page and pageno
options.

-zoom=zoomfactor

Specify the initial zoom factor. Unless the toolbar, pop-up menu and keyboard are disabled, the
user will be able to change the zoom factor. Legal values for zoomfactor are listed below:


number 

Magnification factor in range 10% to 999%. 

one2one 

Select the "one-to-one" mode. 

width 

Select the "fit width" mode. 

page 

Select the "fit page" mode. 

stretch 

Stretch the image to the plugin window size. 


-showposition=px,py

Specify a point in the current page that should be as close as possible to the center of the window. 
The horizontal and vertical positions px,py in the current page are given as fractions in range 0 to 
1. For instance, 0,0 designates the upper left corner of the page, 0.5,0.5 is the center, and 1,1 is the 
lower right corner. 
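A small Python sketch shows how such fractional coordinates map to a position on a page of known size. The function name and the rounding to whole pixels are assumptions; the text only defines the fractions and their origin at the upper left corner.

```python
def showposition_to_pixels(px, py, page_width, page_height):
    """Map fractional page coordinates (0 to 1, origin at upper left) to pixels."""
    if not (0 <= px <= 1 and 0 <= py <= 1):
        raise ValueError("px and py must be fractions in the range 0 to 1")
    return round(px * page_width), round(py * page_height)


# Center of a 2550 x 3300 pixel page.
print(showposition_to_pixels(0.5, 0.5, 2550, 3300))
# prints: (1275, 1650)
```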


-mode=modespec

Specify the initial display mode. Unless the toolbar and pop-up menu are disabled, the user will
be able to change it. Legal values for modespec are listed below:


color 

Display the color image. 

bw 

Display the foreground mask only. 

fore 

Display the foreground only. 

back 

Display the background only. 

text 

Overlay the hidden text over the color image. 


-hor_align=keyword, -halign=keyword

Specify the horizontal position of the page in the viewer window. (This does not specify what part 
of the page will be shown, but rather how margins will be laid out around the page in the plugin 
window.) Argument keyword must be left, center, or right. 


-ver_align=keyword, -valign=keyword

Specify the vertical position of the page in the viewer window. (This does not specify what part of
the page will be shown, but rather how margins will be laid out around the page in the plugin
window.) Argument keyword must be top, center, or bottom.

-cache=(yes|no)

Enable or disable the caching of fully decoded pages of the document. Caching is on by default. 
Caching of documents whose URL does not contain an extension .djvu or .djv is off by default. 


-continuous=(yes|no)

Enable or disable the continuous layout of multipage documents.

-sidebyside=(yes|no), -side_by_side=(yes|no)

Enable or disable the side-by-side layout of multipage documents.

-coverpage=(yes|no)

Specify whether the cover page must be displayed alone when multipage documents are shown in
side-by-side layout.

-righttoleft=(yes|no)

Specify whether pages should be arranged right-to-left when multipage documents are shown in
side-by-side layout.

-layout=keyword{,keyword}

Specify the layout settings using a single list of comma-separated keywords. The following keywords
are recognized:


single 

Disable the side-by-side and continuous modes. 

double 

Enable the side-by-side mode. 

continuous 

Enable the continuous mode. 

cover,nocover 

First page treatment in side-by-side mode. 

ltor,rtol 

Layout direction for side-by-side mode. 

gap,nogap 

Specify whether there is a gap between pages. 


-scrollbars=(yes|no)

Enable or disable the presence of scroll bars when the full image size exceeds the plugin window
size. The default is yes.

-frame=(yes|no)

Enable or disable the display of a thin frame and shadow around the DjVu images. Frames are
enabled by default.

-background=color

Specify the color of the background border displayed around the document. The color color must
be given in hexadecimal RRGGBB or #RRGGBB format.

-toolbar=keyword{(,|+|-)keyword}

Controls the appearance and the contents of the toolbar. The argument of option toolbar is composed
of a number of keywords separated by commas, plus signs, or minus signs. The appearance of
the toolbar is controlled by keywords placed before the first occurrence of a plus or
minus sign. The following keywords are recognized in this context:

no

Disable toolbar. 

always 

Displays the toolbar. 

auto 

Enable toolbar "autohide" mode (not implemented). 

top 

Place toolbar along the top edge. 

bottom 

Place toolbar along the bottom edge. 


The contents of the toolbar are controlled by keywords placed after the first occurrence of a plus
or minus sign. Each keyword adds (after a plus) or removes (after a minus) a particular toolbar
button or group of buttons. The initial content of the toolbar is determined by the first occurrence
of a plus or minus sign. When this is a plus, the toolbar is initially empty. When this is a
minus, the toolbar initially contains the default selection of buttons.
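These rules can be sketched in Python as follows. This is not the viewer's parser: the function name, the DEFAULT_BUTTONS set (an arbitrary subset chosen for illustration, since the default selection is not listed here), and the choice to let a comma after the first sign keep the preceding sign are all assumptions.

```python
import re

# Hypothetical default selection; the real default set is not specified in this page.
DEFAULT_BUTTONS = frozenset({"modecombo", "zoomcombo", "zoom", "select", "rotate"})


def parse_toolbar(spec, default_buttons=DEFAULT_BUTTONS):
    """Split a -toolbar argument into appearance keywords and a button set."""
    appearance = []
    buttons = None   # None until the first plus or minus is seen
    sign = None
    # Each token is an optional separator (comma, plus, minus) and a keyword.
    for sep, word in re.findall(r"([,+-]?)([^,+-]+)", spec):
        if sep in ("+", "-"):
            if sign is None:
                # The first sign decides the initial content of the toolbar.
                buttons = set() if sep == "+" else set(default_buttons)
            sign = sep
        if sign is None:
            appearance.append(word)   # before the first sign: appearance keyword
        elif sign == "+":
            buttons.add(word)
        else:
            buttons.discard(word)
    if buttons is None:
        buttons = set(default_buttons)
    return appearance, buttons


print(parse_toolbar("always,top+zoom+find"))
```

For the argument always,top+zoom+find this yields the appearance keywords always and top and a toolbar that contains only the zoom and find buttons.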

The following keywords are recognized: 


modecombo 

for the display mode selection tool. 

zoomcombo 

for the zoom selection tool. 

zoom 

for the zoom buttons. 

select 

for the selection button. 

rotate 

for the image rotation buttons. 

find 

for the text search button. 

new 

for the new window button. 

open 

for the open new document button. 

save 

for the save button. 

print 

for the print button. 

layout 

for the page layout buttons. 

pagecombo 

for the page selection tool. 

firstlast 

for the first-page and last-page buttons. 

prevnext 

for the previous- and next-page buttons. 

backforw 

reserved for the back and forward buttons. 

help 

for the contextual help button. 


For the sake of backward compatibility, the keywords fore, fore_button, back, back_button, bw, bw_button,
color, and color_button are interpreted like keyword modecombo; the keyword rescombo is a synonym
of zoomcombo; the keywords pan, zoomsel, and textsel are interpreted like keyword select; and the
keyword doublepage is interpreted like keyword layout. All other keywords are ignored.

-menubar=(yes|no)

Enable or disable the presence of the menu bar located on top of the window.

-statusbar=(yes|no)

Enable or disable the presence of the status bar located at the bottom of the window.

-sidebar=keyword{,keyword}

Control the dockable panels. The argument is a comma-separated list of keywords. A first group
of keywords selects which panels are affected. Omitting these keywords selects all panels. A second
group of keywords then controls the visibility and the position of the selected panels.

thumbnails

specify the thumbnail panel.

outline,bookmarks

specify the outline panel.

search,find

specify the search panel.

yes,true 

show the specified panels (default). 

no,false 

hide the specified panels. 

left 

dock specified panels on the left side. 

right 

dock specified panels on the right side. 

top 

dock specified panels on the top side. 

bottom 

dock specified panels on the bottom side. 


-thumbnails=keyword{,keyword}

Compatibility alias for -sidebar=keyword{,keyword},thumbnails.

-outline=keyword{,keyword}

Compatibility alias for -sidebar=keyword{,keyword},outline.

-menu=(yes|no)

Enable or disable the pop-up menu.

-keyboard=(yes|no)

Enable or disable the DjVu plugin keyboard shortcuts. The default is yes (enabled).

-mouse=(yes|no)

Enable or disable mouse interaction for panning and selecting. The default is yes (enabled).

-links=(yes|no)

Enable or disable hyper-links in the DjVu image. Hyper-links are enabled by default. 


-highlight=x,y,w,h[,color]

Display a highlighted rectangle at the specified coordinates in the current page and with the specified
color. Coordinates x, y, w, and h are measured in document image coordinates (not screen
coordinates). The origin is set at the bottom left corner of the image. The color color must be
given in hexadecimal RRGGBB or #RRGGBB format. Multiple highlighted zones can be specified
and can be interspersed with multiple -page=pagename options.

-find=text

Highlight occurrences of the given string text. This option works when the document contains a 
hidden text layer. It can be used in conjunction with -sidebar=find to display the text searching 
interface. 

String text can be terminated by a slash (/) followed by letters specifying search options. The following
letters are recognized:

c Case-sensitive search.

C Case-insensitive search (default).

w Search hits start on word boundaries (default).

W Ignore word boundaries.

r Regular expression search.

R String search (default).
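The option letters can be interpreted as in the following Python sketch. The function name and the dictionary keys are hypothetical. How the viewer treats a slash that is not followed by valid option letters is not stated above; this sketch assumes that such a slash belongs to the search text.

```python
FLAG_LETTERS = "cCwWrR"


def parse_find(arg):
    """Split a -find argument into the search text and its search options."""
    # Defaults as listed above: C, w, and R.
    options = {"case_sensitive": False, "word_boundaries": True, "regex": False}
    text, slash, flags = arg.rpartition("/")
    if not slash or not flags or any(ch not in FLAG_LETTERS for ch in flags):
        return arg, options   # no trailing option letters (assumed behavior)
    for ch in flags:
        if ch in "cC":
            options["case_sensitive"] = ch == "c"
        elif ch in "wW":
            options["word_boundaries"] = ch == "w"
        else:
            options["regex"] = ch == "r"
    return text, options


print(parse_find("DjVu/cW"))
# prints: ('DjVu', {'case_sensitive': True, 'word_boundaries': False, 'regex': False})
```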

-rotate=(0|90|180|270)

Rotate the DjVu image by the specified angle expressed in degrees counter-clockwise.

-print=(yes|no)

Enable or disable printing the DjVu document. Printing is enabled by default.

-save=(yes|no)

Enable or disable saving the DjVu document. Saving is enabled by default.


-passive 

Cause the DjVu image to be displayed in a manner similar to an ordinary web image. The default 
zoom factor is changed to page. The toolbar, the status bar, the scrollbars, the menus, and the keyboard
shortcuts are disabled.

-passivestretch 

Cause the DjVu image to be displayed in a manner similar to an ordinary web image. The default 
zoom factor is changed to stretch. The toolbar, the status bar, the scrollbars, the menus, and the 
keyboard shortcuts are disabled. 

-nomenu, -notoolbar, -noscrollbars 

These options were recognized by some versions of the DjVu viewers and are honored for the sake 
of backward compatibility. A warning message is printed when option -verbose is active. 

-logo, -textsel, -search 

These options were recognized by some versions of the DjVu viewers but are currently not implemented
by djview4. A warning message is printed when option -verbose is active.


USAGE 

Most features can be accessed using the menus, the toolbar, the side bar or the pop-up menu shown when
the right mouse button is pressed over a DjVu image. Detailed help can be accessed by clicking the contextual
help icon from the toolbar and then clicking on various sections of the djview user interface.

The following table lists some useful key combinations recognized when the DjVu document is active:


Key                     Action
SHIFT+F1                Activate the contextual help.
1, 2, and 3             Change zoom to 100%, 200% and 300%.
Up, Down, Left, Right   Scroll the image in the given direction.
Home                    Display top left corner of the image.
End                     Display bottom right corner of the image.
Control+Home            Go to the beginning of the document.
Control+End             Go to the end of the document.
Space, Return           Scroll down or go to next page.
Backspace               Scroll up or go to previous page.
Page Down               Go to the next page.
Page Up                 Go to the previous page.
+, -                    Zoom in and out.
[, ]                    Rotate image.
W                       Select the "Fit Width" zooming mode.
P                       Select the "Fit Page" zooming mode.
CTRL+F, F3              Search the hidden text layer.


Handy effects can be achieved by holding modifier keys. Although these keys are configurable from the
preference dialog, the following table lists the default assignments:


DjVuLibre 10/11/2001 DJVIEW4(1)

Key          Action
CTRL+SHIFT   Hold these keys to show the magnification lens.
CTRL         Hold this key to select an area with the mouse.
SHIFT        Hold this key to display all hyperlinks.


CREDITS 


This program was written by Leon Bottou <leonb@users.sf.net> and is distributed under the GNU General 
Public License. This program includes code derived from program tiff2pdf, written by Ross Finlayson and 
released under a BSD license. 


SEE ALSO 

djvu(1), ddjvu(1), nsdejavu(1), djview3(1), tiff2pdf(1)

NAME

nsdejavu - DjVu browser plugin 

SYNOPSIS 

/usr/lib/x86_64-linux-gnu/netscape/plugins/nsdejavu.so 

DESCRIPTION 

The shared library nsdejavu.so uses the Netscape browser plugin API to display DjVu images in a number
of popular web browsers. Different web browsers provide various levels of support for Netscape plugins.
Please check section "Browser Compatibility" for instructions on how to enable the DjVu browser
plugin.

The DjVuLibre browser plugin works by invoking a standalone viewer with the special command line 
option -netscape. The plugin first searches for a program named djview. If this program cannot be found, it
searches for djview4 and finally djview3. It is always possible to override this search strategy by setting 
the environment variable NPX_DJVIEW to the full path of the desired executable. 
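
The search order described above can be sketched in Python. This is an illustration only, not the plugin's code: the function name find_viewer is hypothetical, and looking the programs up on the PATH with shutil.which is an assumption about how the search is performed.

```python
import os
import shutil


def find_viewer(environ=None, which=shutil.which):
    """Return the viewer a plugin following the documented order would launch.

    NPX_DJVIEW, when set, overrides the search. Otherwise the names djview,
    djview4 and djview3 are tried in that order. Returns None if none is found.
    The PATH lookup through `which` is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("NPX_DJVIEW")
    if override:
        return override
    for name in ("djview", "djview4", "djview3"):
        path = which(name)
        if path:
            return path
    return None
```

Passing a custom `which` callable makes the search order easy to exercise without installed viewers.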


MIME TYPES AND EXTENSIONS 

Typing the URL of a recognized DjVu document in your web browser should automatically invoke the DjVu 
browser plugin. Each browser uses different methods to determine that a particular URL is in fact a DjVu 
document. Web servers normally provide a MIME type to web browsers. The official MIME type for DjVu
documents is image/vnd.djvu. For compatibility with ancient versions of the DjVu viewer, it is common to 
use instead the experimental MIME type image/x-djvu or image/x.djvu. Web servers should be configured 
to send the proper MIME type for DjVu documents. Most web browsers also recognize files ending with 
.djvu or .djv as DjVu files. 

An easy way to check whether an HTTP server is giving an appropriate content type is to invoke the following
command with a URL corresponding to an actual DjVu file on the server.

curl -I URL | grep Content-Type

The result should be one of the following, preferably the first. 

Content-Type: image/vnd.djvu 
Content-Type: image/x.djvu 
Content-Type: image/x-djvu 

Any other MIME type indicates a server misconfiguration. 
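
The same check can be scripted. The following Python sketch classifies the value of a Content-Type header according to the three MIME types listed above. The function name and the labels "official", "legacy" and "misconfigured" are hypothetical names chosen for this example.

```python
# MIME types named in this section, preferred one first.
OFFICIAL = "image/vnd.djvu"
LEGACY = ("image/x.djvu", "image/x-djvu")


def classify_content_type(header_value):
    """Classify a Content-Type header value for a DjVu document.

    Parameters such as "; charset=..." are ignored and the comparison is
    case-insensitive.
    """
    mime = header_value.split(";", 1)[0].strip().lower()
    if mime == OFFICIAL:
        return "official"
    if mime in LEGACY:
        return "legacy"
    return "misconfigured"
```

For instance, classify_content_type("application/octet-stream") reports a misconfigured server.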


CGI-STYLE FLAGS 

The behavior of the DjVu browser plugin can be specified by augmenting the URL using a syntax similar to 
that used by the CGI programs. This syntax is described by the following template: 

http://.../file.djvu?djvuopts&keyword=value&keyword=value&...

The DjVu browser plugin only recognizes keywords that appear after the word djvuopts. The keywords
recognized by each viewer are listed in the man pages for djview3(1) and djview4(1). Unrecognized keywords
are ignored. The most common keywords are:

page=pagename

Specify which page is displayed by name or by ordinal number.

zoom=zoomfactor

Set the zoom factor. Legal values for zoomfactor are:

number

Magnification factor in range 10% to 999%. 

one2one 

Select the "one-to-one" mode. 

width 

Select the "fit width" mode. 

page 

Select the "fit page" mode. 

stretch 

Stretch the image to the plugin window size. 
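
As an illustration, the URL template above can be assembled with a small Python helper. The helper name djvu_url and the example host are hypothetical, and percent-encoding the values is an assumption of this sketch rather than something this page specifies.

```python
from urllib.parse import quote


def djvu_url(base_url, **options):
    """Append CGI-style flags to a DjVu document URL.

    The word djvuopts comes first, followed by keyword=value pairs in the
    order given. Values are percent-encoded (an assumption of this sketch).
    """
    parts = ["djvuopts"]
    for key, value in options.items():
        parts.append(f"{quote(str(key), safe='')}={quote(str(value), safe='')}")
    return base_url + "?" + "&".join(parts)
```

Calling djvu_url("http://example.org/file.djvu", page=3, zoom="width") yields a URL that opens page 3 in "fit width" mode.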


BROWSER COMPATIBILITY 

The DjVu browser plugin has been tested with several popular web browsers: Netscape 4 and 6; Gecko-based
browsers such as Mozilla, Galeon and Firefox; KHTML-based browsers such as Konqueror; and
Opera. Please read the browser documentation to find out where the plugin library should be installed. 


EMBEDDING DJVU IMAGES IN HTML PAGES 

You can integrate DjVu content on an HTML web page with either the <embed> or the <object> tag. This 
method will work even if your web server does not support the DjVu MIME type. The CGI style flags can 
be directly used as attributes of the embedding tag. The following example shows the W3C standard syntax 
with the OBJECT tag:

<object data="myfile.djvu" type="image/vnd.djvu"
        width="100%" height="100%">
  <param name="page" value="iii">
  <param name="zoom" value="stretch">
  This browser cannot render djvu data.
</object>

And this is the customary syntax with the EMBED tag:

<embed src="myfile.djvu" type="image/vnd.djvu"
       width="100%" height="100%"
       page="iii" zoom="stretch"></embed>

INTERFACING THE DJVIEW PLUGIN WITH JAVASCRIPT 

Recent versions of the djview4 plugin can be controlled from the JavaScript interpreter of browsers implementing
the Mozilla NPRuntime API. To access the plugin object, include the attribute id="pluginname"
in the <object> or <embed> tag and use the JavaScript function getElementById("pluginname").

The plugin object implements two methods to retrieve and set the value of the options usually recognized as 
CGI-style flags. It also can evaluate a specified JavaScript expression whenever something changes in the 
status of the djview interface. 

pluginobject.setdjvuopt("key", value)

Set the value of the djvu option key to the character string value. This achieves the same effect as 
specifying option key=value among the CGI-style flags. For instance, values of the key page can 
be page IDs, page titles, page numbers, or page names. 

pluginobject.getdjvuopt("key")

Return the value of the djvu option key as a string. The returned value is always a character string, 
even when the return is logically a number. Boolean values are returned as strings yes or no. 
Besides the usual CGI-style flags, this function recognizes the additional key pages and returns the 
total number of pages in the DjVu document. An empty string is returned when the key is not recognized.

pluginobject.onchange="code";

Ensure that string code is evaluated in the context of the plugin object whenever something
changes in the djview graphical user interface. For instance, this evaluation happens when progressive
refinements are painted, and when the user manipulates the image interactively.

pluginobject.version

Return a string describing the plugin version. This property can be used to test whether the djview 
plugin is scriptable in this browser. 

Note that the scriptability feature may not be accessible until the djview plugin is fully loaded. Therefore it
is advisable to check pluginobject.version from the JavaScript onload handler before calling any other method.

CREDITS 

This program was written by Andrei Erofeev <andrew_erofeev@yahoo.com> and was then improved by
Bill Riemers <docbill@sourceforge.net> and Leon Bottou <leonb@users.sourceforge.net>. 

SEE ALSO 

djvu(1), ddjvu(1), djview4(1), djview3(1)

NAME

any2djvu - Convert .ps/.ps.gz/.pdf to .djvu 

SYNOPSIS 

any2djvu url {filename(s)} 

DESCRIPTION 

Converts files from .ps/.ps.gz/.pdf to .djvu by running them through a web server willing to perform this 
task. 

Invoke with -h switch for usage information. 

ENVIRONMENT 

A non-empty value of DJVU_ONLINE_ACK acknowledges transmission of the documents to the server (so
that no warning dialog is displayed).

EXAMPLES 

any2djvu http://www.bcl.hamilton.ie/~barak/papers mesh-preprint.ps.gz 
any2djvu localfile.pdf 

AUTHORS 

David Kreil, Barak A. Pearlmutter, Yaroslav O. Halchenko 

BUGS 

Using a web-based encoder server is a stop-gap measure until better encoders enjoy wide free distribution. 

There is a security issue in operating on documents not intended for widespread distribution, which could 
be partially although not completely ameliorated by using a secure web connection. 

SEE ALSO 

The entire djvu suite, e.g. djvu(1), djview(1), and djvuserver(1).

NAME

bzz - DjVu general purpose compression utility. 

SYNOPSIS 

Encoding: 

bzz -e[blocksize] inputfile outputfile

Decoding: 

bzz -d inputfile outputfile 

DESCRIPTION 

The first form of the command line (option -e) compresses the data from file inputfile and writes the compressed
data into outputfile. The second form of the command line (option -d) decompresses file inputfile
and writes the output to outputfile.

OPTIONS 

-d Decoding mode. 

-e[blocksize]

Encoding mode. The optional argument blocksize specifies the size of the input file blocks processed
by the Burrows-Wheeler transform expressed in kilobytes. The default block size is 2048
KB. The maximal block size is 4096 KB. Specifying a larger block size usually produces higher
compression ratios and increases the memory requirements of both the encoder and decoder. It is
useless to specify a block size that is larger than the input file.

ALGORITHMS 

The Burrows-Wheeler transform is performed using a combination of the Karp-Miller-Rosenberg and the 
Bentley-Sedgewick algorithms. This is comparable to (Sadakane, DCC 98) with a slightly more flexible 
ranking scheme. Symbols are then ordered according to a running estimate of their occurrence frequencies. 
The symbol ranks are then coded using a simple fixed tree and the ZP binary adaptive coder (Bottou, DCC 
98). 

The Burrows-Wheeler transform is also used in the well known compressor bzip2. The originality of bzz 
is the use of the ZP adaptive coder. The adaptation noise can cost up to 5 percent in file size, but this 
penalty is usually offset by the benefits of adaptation. 
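
For readers unfamiliar with the Burrows-Wheeler transform, the following Python sketch shows what the transform computes on a short string. It is a naive textbook version that sorts all rotations; it is not the Karp-Miller-Rosenberg and Bentley-Sedgewick combination used by bzz, and the function names and the "$" end marker are choices made for this example only.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    The sentinel marks the end of the text and must not occur in it.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending and sorting columns."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    for row in table:
        if row.endswith(sentinel):
            return row[:-1]
    raise ValueError("sentinel not found in the transformed data")
```

The transform groups identical symbols together ("banana" becomes "annb$aa"), which is what makes the later ranking and entropy coding stages effective.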

PERFORMANCE 

The following table shows comparative results (in bits per character) on the Canterbury Corpus
(http://corpus.canterbury.ac.nz). The very good bzz performance on the spreadsheet file excl puts the
weighted average ahead of much more sophisticated compressors such as fsmx.



                               Compression performance
            text  fax   csrc  excl  sprc  tech  poem  html  lisp  man   play  Weighted  Average
compress    3.27  0.97  3.56  2.41  4.21  3.06  3.38  3.68  3.90  4.43  3.51  2.55      3.31
gzip -9     2.85  0.82  2.24  1.63  2.67  2.71  3.23  2.59  2.65  3.31  3.12  2.08      2.53
bzip2 -9    2.27  0.78  2.18  1.01  2.70  2.02  2.42  2.48  2.79  3.33  2.53  1.54      2.23
ppmd        2.31  0.99  2.11  1.08  2.68  2.19  2.48  2.38  2.43  3.00  2.53  1.65      2.20
fsmx        2.10  0.79  1.89  1.48  2.52  1.84  2.21  2.24  2.29  2.91  2.35  1.63      2.06
bzz         2.25  0.76  2.13  0.78  2.67  2.00  2.40  2.52  2.60  3.19  2.52  1.44      2.16

Note that DjVu contributors have several entries in this table. Program compress was written some time 
ago by Joe Orost. Program ppmd is an improvement of the PPM-C method invented by Paul Howard.

CREDITS

Program bzz was written by Leon Bottou <leonb@users.sourceforge.net> and was then improved by 
Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers <docbill@sourceforge.net> and many others.

SEE ALSO 

djvu(1), compress(1), gzip(1), bzip2(1)

NAME

c44 - DjVuPhoto encoder.

SYNOPSIS 

c44 [ options ] inputfilename [ outputfilename ] 

DESCRIPTION 

Produces a DjVuPhoto encoded image. The input image file inputfilename can be either a portable graymap
(PGM) or a portable pixmap (PPM). Input images compressed with JPEG are also accepted. It is however
suggested to only use high quality JPEG files (low compression ratio, large size) because the wavelet
compression will increase the defects already present in highly compressed JPEG files.

The program produces a DjVuPhoto file outputfilename. If the output file name is not specified, a default 
file name will be generated by replacing the input file name suffix by suffix djvu. 

The main design objective for the DjVu wavelets was to allow progressive rendering and smooth
scrolling of large images with limited memory requirements. Decoding functions process the compressed
data and update a memory-efficient representation of the wavelet coefficients. Imaging functions can then
quickly render an arbitrary segment of the image using the available data. Both processes can be carried out
in two threads of execution. This design plays an important role in the DjVu system. We investigated various
state-of-the-art wavelet compression schemes. Although these schemes may achieve slightly smaller
file sizes, the decoding functions did not even approach our requirements. The IW44 wavelets meet these
requirements today and may in the future implement more modern refinements if these refinements can be
implemented within our constraints.

QUALITY SELECTION OPTIONS 

DjVuPhoto files are logically composed of a sequence of "slices" containing successive image refinements. 
Slices are grouped in "chunks" defining the progressive rendering sequence. The viewer is able to display 
an intermediate image after processing each chunk. A typical DjVuPhoto file contains 80 to 120 slices
grouped into 1 to 4 chunks. 

The quality selection options provide various ways to specify the number of chunks and the number of 
slices per chunk. The c44 program adds slices to the current chunk until exceeding a target number of 
slices, a target file size, or a target quality specification. The following options define targets for each 
chunk. The option argument contains several numerical values (one per chunk) separated by either commas
or pluses. 

-slice n+...+n 

Specify the number of slices in each chunk. The option argument contains plus-separated numerical
values (one per chunk) indicating the number of slices per chunk. Option -slice 74+13+10, for
instance, would be appropriate for compressing a photographic image with three progressive
refinements. More quality and more refinements can be obtained with option -slice 72+11+10+10.

-slice n,...,n 

Specify the cumulative number of slices for each chunk. Since the final quality is determined by 
the total number of slices, it is often more convenient to use comma-separated values (one per 
chunk) indicating the cumulative number of slices for each chunk (i.e. including those encoded in 
all previous chunks). The values suggested above can also be expressed as -slice 74,87,97 and 
-slice 72,83,93,103. 

-size n,...,n 

Specify size targets for each chunk expressed in bytes. The option argument can be either a plus-separated
list specifying a size for each chunk, or a comma-separated list specifying cumulative
sizes for each chunk and all previous chunks. Size targets are approximate. Slices will be added
to each chunk until exceeding the specified target.

-bpp n,...,n 

Specify size targets for each chunk expressed in bits-per-pixel. Both comma-separated and plus-separated
specifications are accepted. Option -bpp 0.25,0.5,1 usually provides good results.

-percent

Specify size targets for each chunk expressed as a percentage of the input file size. Both comma-separated
and plus-separated specifications are accepted. Results can be drastically different
according to the format of the input image (raw or JPEG compressed).

-decibel 

Specify quality targets for each chunk expressed as a comma-separated list of increasing decibel 
values. Decibel values range from 16 (very low quality) to 48 (very high quality). This criterion 
should not be relied upon when re-encoding an image previously compressed by another compression
scheme. Selecting this option significantly increases the compression time.

-dbfrac frac 

Indicate that the decibel values specified in option -decibel should be computed by averaging the 
mean squared errors of only the fraction frac of the most mis-represented blocks of 32 x 32 pixels. 
This option is useful with composite images containing solid color features (e.g. an image with a 
large white border). 

Providing no quality specification options automatically selects a default quality specification -slice
74,89,99. Multiple quality specification options are allowed. The program outputs a file whose total number
of chunks is the largest number of chunks of all quality specifications. Slices are added to each chunk
until reaching any of the quality targets for this chunk.
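
The relation between the plus-separated and the comma-separated (cumulative) notations, and the arithmetic behind a bits-per-pixel target, can be sketched in Python. The function names are hypothetical, and the byte count is plain arithmetic (bits per pixel times pixel count, divided by 8); how c44 rounds or applies the target internally is not specified here.

```python
def plus_to_cumulative(spec):
    """Convert per-chunk values "74+13+10" to cumulative values "74,87,97"."""
    total = 0
    cumulative = []
    for part in spec.split("+"):
        total += int(part)
        cumulative.append(total)
    return ",".join(str(n) for n in cumulative)


def cumulative_to_plus(spec):
    """Convert cumulative values "74,87,97" to per-chunk values "74+13+10"."""
    values = [int(part) for part in spec.split(",")]
    increments = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    return "+".join(str(n) for n in increments)


def bpp_to_bytes(bpp, width, height):
    """Size in bytes that corresponds to a bits-per-pixel figure."""
    return bpp * width * height / 8
```

With these helpers, the default specification -slice 74,89,99 corresponds to the per-chunk form 74+15+10.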

OTHER OPTIONS 

The following additional options are supported: 

-dpi n Specify the resolution information encoded into the output file expressed in dots per inch. The resolution
information encoded in DjVu files determines how the decoder scales the image on a particular
display. Meaningful resolutions range from 25 to 1200. The default value, 100 dpi, should be
suitable for most photographic images.

-gamma n 

Specify the gamma correction information encoded into the output file. The argument n specifies
the gamma value of the device for which the input image was designed. The default value is 2.2. 
This is appropriate for images designed for a standard computer monitor. 

-mask pbmfilename 

The design of the IW44 wavelets allows for compressing partially masked images. This option can 
be used when certain pixels of a background image are going to be covered by foreground objects 
like text or drawings. File pbmfilename must be a PBM file whose size matches the size of the input
file. Each black pixel in pbmfilename means that the value of the corresponding pixel in the input file is
irrelevant. The IW44 encoder will replace the masked pixels by a color value whose coding cost is 
minimal (see http://www.djvuzone.org/djvu/techpapers/mask/index.djvu for technical details.) 

-crcbnormal 

Select normal chrominance encoding. Chrominance information is encoded at the same resolution 
as the luminance. This is the default. 

-crcbhalf 

Selects half resolution chrominance encoding. Chrominance information is encoded at half the 
luminance resolution. 

-crcbdelay n 

This option can be used with -crcbnormal and -crcbhalf to modify the quality of the chrominance 
information. The option argument specifies a parameter n, expressed in slices, that reduces the
bit-rate associated with the chrominance. The default chrominance encoding delay is 10 slices. 

-crcbfull 

Select the highest possible quality for encoding the chrominance information. This is equivalent to 
specifying -crcbnormal and -crcbdelay 0.

-crcbnone

Disable the encoding of the chrominance. Only the luminance information will be encoded. The 
resulting image will show in shades of gray. 

REMARKS 

The default quality setting of the DjVuLibre version of c44 has been increased. It produces larger files with 
a better quality. Quality can be lowered using the quality selection options.

BUGS 

The encoder requires more memory than necessary. 

The rechunking capability is currently broken. 

CREDITS 

This program was written by Leon Bottou <leonb@users.sourceforge.net> and was then improved by 
Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers <docbill@sourceforge.net> and many others.

SEE ALSO 

djvu(1), pnm(5), cjpeg(1).

NAME

cjb2 - Simple DjVuBitonal encoder. 

SYNOPSIS 

cjb2 [ options ] inputfile outputdjvufile 

DESCRIPTION 

This is a simple encoder for bitonal files. Argument inputfile is the name of a PBM or bitonal TIFF file containing
a single document image. This program produces a DjVuBitonal file named outputdjvufile.

The default compression process is lossless: decoding the DjVuBitonal file at full resolution will produce 
an image exactly identical to the input file. Lossy compression is enabled by options -losslevel, -lossy, or 
-clean. 

OPTIONS 

-dpi n Specify the resolution information encoded into the output file expressed in dots per inch. The resolution
information encoded in DjVu files determines how the decoder scales the image on a particular
display. Meaningful resolutions range from 25 to 1200. The default resolution for TIFF files
is the resolution specified by the input file. The default resolution for PBM files is
300 dpi.

-lossless 

Ensure that the encoded image is pixel-per-pixel equal to the initial image. This option is equivalent
to -losslevel 0 and is the default.

-clean Only remove flyspecks from the input image. This option enables a heuristic algorithm that
removes very small marks. Such marks are often caused by noise and dust during the scanning
process. The threshold mark size is chosen according to the resolution specified with option -dpi. This
option is equivalent to -losslevel 1.

-lossy Substitute patterns with small variations. In addition to the flyspeck removal heuristic, this option
enables an algorithm that encodes certain characters by simply replicating the shape of a previously
encoded character with a similar shape. This option is equivalent to -losslevel 100.

-losslevel x 

Specify the aggressiveness of the lossy compression. Its argument ranges from 0 to 200. Higher 
values generate smaller files with more potential distortions. Loss level 0 corresponds to lossless 
encoding. Loss level 1 performs image cleaning but does not perform character substitution at all. 
Loss level 100 is intended to provide a good compromise. Higher loss levels provide marginally 
better compression at the risk of unacceptable character substitutions. 

-verbose 

Display informational messages while running. 
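
The equivalences between the loss-related options can be summarized with a short Python sketch that maps a list of options to the resulting loss level. The function name is hypothetical, and letting the last loss-related option win is an assumption of this example; this page does not state how cjb2 resolves conflicting options.

```python
def loss_level(options):
    """Return the loss level implied by a list of cjb2-style options.

    -lossless, -clean and -lossy are treated as shorthands for loss levels
    0, 1 and 100. "Last option wins" is an assumption of this sketch.
    """
    level = 0  # lossless encoding is the documented default
    args = iter(options)
    for opt in args:
        if opt == "-lossless":
            level = 0
        elif opt == "-clean":
            level = 1
        elif opt == "-lossy":
            level = 100
        elif opt == "-losslevel":
            level = int(next(args))
            if not 0 <= level <= 200:
                raise ValueError("loss level must be in range 0 to 200")
    return level
```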


REMARKS 

Lossless encoding is competitive with that of the Lizardtech commercial encoders. 

Lossy encoding has made much progress thanks to Ilya Mezhirov from the minidjvu project. This also
means that the lossy encoding performance can change from version to version. When lossy compression
yields inadequate results, simply revert to only using option -clean or reduce the parameter of option
-losslevel.

Two features are still missing: 

* Half-tone detection. Collecting small marks belonging to half-tone patterns would improve compression
speed.

* Multi-page compression. Matching characters on several pages would improve the compression ratios
for multi-page documents.


CREDITS 

This program was initially written by Leon Bottou <leonb@users.sourceforge.net> and was improved by 
Bill Riemers <docbill@sourceforge.net> and many others. The pattern matching algorithm for lossy compression
was contributed by Ilya Mezhirov <ilya@mezhirov.mccme.ru>. TIFF input routines are inspired
by the ones contributed by R. Keith Dennis <dennis@rkd.math.Cornell.edu> and Paul Young. 


SEE ALSO 

djvu(1), pbm(5).

NAME

cpaldjvu - DjVuDocument encoder for low-color images. 

SYNOPSIS 

cpaldjvu [ options ] inputppmfile outputdjvufile 

DESCRIPTION 

Program cpaldjvu is a DjVuDocument encoder for images containing few colors. It performs best on 
images containing large solid color areas such as screen dumps. Compression ratios on such images can be 
much higher than those achieved by GIF or PNG compression. 

This program works by first reducing the number of distinct colors to a small specified value using a simple 
color quantization algorithm. The dominant color is encoded into the background layer. The other colors 
are encoded into the foreground layer. 

OPTIONS 

-dpi n Specify the resolution information encoded into the output file expressed in dots per inch. The resolution
information encoded in DjVu files determines how the decoder scales the image on a particular
display. Meaningful resolutions range from 25 to 6000. The default value is 300 dpi.

-colors n 

Specify a maximum number of distinct colors for the color quantization process. The
default value is 256. Smaller values can produce much smaller files.

-bgwhite 

Cause the background layer to use the lightest quantized color instead of the dominant color.

-verbose 

Display informational messages while running. 
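
The layer assignment described in this page can be illustrated with a toy Python sketch that works on an already quantized list of RGB pixels. It is not the cpaldjvu implementation: the function name is hypothetical, and measuring lightness as the sum of the red, green and blue components is an assumption made for this example.

```python
from collections import Counter


def split_layers(pixels, bgwhite=False):
    """Pick the background color and list the foreground colors.

    pixels is a sequence of (r, g, b) tuples. By default the dominant (most
    frequent) color goes to the background layer. With bgwhite=True the
    lightest color is used instead; lightness is taken to be r + g + b,
    which is an assumption of this sketch.
    """
    counts = Counter(pixels)
    if bgwhite:
        background = max(counts, key=sum)
    else:
        background = counts.most_common(1)[0][0]
    foreground = sorted(color for color in counts if color != background)
    return background, foreground
```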


REMARKS 

The color quantization might introduce severe degradation if the image contains photographic areas with a 
large number of very similar colors. Color quantization problems might be solved by pre-processing the 
input file with a different quantization program such as ppmquant. Avoid using the error diffusion dithering
algorithm. This algorithm generates random dithering patterns that might be very costly to encode.


BUGS 

This program should be rewritten as a pre-processor for csepdjvu. 

CREDITS 

This program was initially written by Leon Bottou <leonb@users.sourceforge.net> and was improved by 
Bill Riemers <docbill@sourceforge.net> and many others. 

SEE ALSO 

djvu(1), pbm(5), ppmquant(1), pnmtogif(1), pnmtopng(1)


DjVuLibre-3.5                         10/11/2001

CSEPDJVU(1)                         DjVuLibre-3.5                         CSEPDJVU(1)

NAME 

csepdjvu - DjVu encoder for separated data files. 

SYNOPSIS 

csepdjvu [options] [sepfiles]... outputdjvufile 

DESCRIPTION 

This program creates a DjVuDocument file outputdjvufile from separated data files sepfiles. It can read separated
data from the standard input when given a single dash instead of the separated data file names. This
feature is intended for pre-processing programs that push separated data into csepdjvu via a pipe.

Each separated data file represents one or more page images. When the program arguments specify multiple
pages, all the pages are encoded and saved as a bundled multi-page document. When the program arguments
specify a single page, the page is encoded and saved as a single page file.

OPTIONS 

-d n Specify the resolution information encoded into the output file expressed in dots per inch. The resolution
information encoded in DjVu files determines how the decoder scales the image on a particular
display. Meaningful resolutions range from 25 to 6000. The default value is 300 dpi.

-q 

-q n+...+n 

Specify the encoding quality of the IW44 encoded background layer. The option argument contains
several integers (one per chunk) separated by either commas or pluses. This option is similar
to option -slice of program c44. Please refer to the c44(1) man page for additional details. The
default quality specification is -q 72,83,93,103.

This option does not apply to uniformly white backgrounds that were not specified by the separated
data but are called for by the DjVu specification. Such background images always come at the 
lowest possible resolution and with a standard quality setting that ensures the color uniformity. 

-t Program csepdjvu interprets certain comments in the separated file to construct a hidden text layer 
in the DjVu file. This layer records the location of each word for highlighting purposes. This option
reduces the file size by simply recording the location of each line. 

-v Display a brief message describing each page. 

-vv Display extensive informational messages during encoding. 

SEPARATED DATA FILE FORMAT 

Each separated data file contains a concatenation of one or more separated page images. Each page is logically
represented by a foreground image with a transparent color and by a background image visible
through the transparent pixels. The data for each separated page image is the concatenation of the following
data blocks:

* A foreground image encoded using either the "Color RLE format" or the "Bitonal RLE format". These 
formats are described later in this section. 

* An optional background image encoded as a "Portable Pixmap" (PPM). This well-known format is
summarized later in this section. The absence of a background image simply indicates that a uniformly 
white background should be assumed. 

* An arbitrary number of comment lines starting with character "#" and terminated by a linefeed character.
Comment lines whose first word starts with a capital letter have special meanings documented later
in this document.

The dimensions (width and height) of the background image must be obtained by rounding up the quotient
of the foreground image dimensions by an integer reduction factor ranging from 1 to 12. Assume, for
instance, that the width of the foreground is 2507 and the reduction factor is 3. The width of the background
image will be the integer ratio (2507+2)/3, that is, 836.
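
The rounding rule can be written as a one-line integer computation. The following Python sketch uses a hypothetical function name and simply applies the formula above to both dimensions.

```python
def background_size(fg_width, fg_height, reduction):
    """Background dimensions for a given foreground size and reduction factor.

    Each dimension is the foreground dimension divided by the reduction
    factor, rounded up, computed as (n + reduction - 1) // reduction.
    """
    if not 1 <= reduction <= 12:
        raise ValueError("reduction factor must be in range 1 to 12")
    width = (fg_width + reduction - 1) // reduction
    height = (fg_height + reduction - 1) // reduction
    return width, height
```

For a 2507 by 3000 foreground and a reduction factor of 3, the background is 836 by 1000 pixels.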

Color RLE format 

The Color RLE format is a simple run-length encoding scheme for color images with a limited number of 
distinct colors. The data always begin with a text header composed of the two characters "R6", the number
of columns, the number of rows, and the number of color palette entries. All numbers are expressed in decimal
ASCII. These four items are separated by blank characters (space, tab, carriage return, or linefeed) or
by comment lines introduced by character "#". The last number is followed by exactly one character which 
usually is a linefeed character. 

The header is followed by the color palette containing three bytes per color entry. The bytes represent the 
red, green, and blue components of the color. 

The palette is followed by a collection of four-byte integers (most significant bit first) representing runs of
pixels with an identical color. The twelve upper bits of this integer indicate the index of the run color in the
palette. The twenty lower bits of the integer indicate the run length. Color indices greater than 0xff0
are reserved. Color index 0xfff is used for transparent runs. Each row is represented by a sequence of runs
whose lengths add up to the image width. Rows are encoded starting with the top row and progressing 
toward the bottom row. 
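
The run layout can be illustrated with Python's struct module. This is a sketch with hypothetical function names, not code taken from csepdjvu. It assumes that "most significant bit first" means the four bytes are stored in big-endian order.

```python
import struct

TRANSPARENT = 0xFFF      # color index used for transparent runs
MAX_COLOR_INDEX = 0xFF0  # indices above this value are reserved
MAX_RUN_LENGTH = 0xFFFFF  # twenty bits of run length


def color_rle_header(columns, rows, ncolors):
    """Text header: "R6", columns, rows, palette size, then one linefeed."""
    return f"R6 {columns} {rows} {ncolors}\n".encode("ascii")


def pack_run(color_index, length):
    """Pack one run: 12-bit color index followed by 20-bit run length.

    Big-endian byte order is an assumption of this sketch.
    """
    if color_index != TRANSPARENT and not 0 <= color_index <= MAX_COLOR_INDEX:
        raise ValueError("reserved or invalid color index")
    if not 0 <= length <= MAX_RUN_LENGTH:
        raise ValueError("run length does not fit in twenty bits")
    return struct.pack(">I", (color_index << 20) | length)


def unpack_run(data):
    """Return (color_index, length) from four bytes produced by pack_run."""
    (word,) = struct.unpack(">I", data)
    return word >> 20, word & MAX_RUN_LENGTH
```

A transparent run of five pixels, for instance, packs to the four bytes ff f0 00 05.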

Bitonal RLE format 

The Bitonal RLE format is a simple run-length encoding scheme for bitonal images. The data always begin 
with a text header composed of the two characters "R4", the number of columns, and the number of rows. 
All numbers are expressed in decimal ASCII. These three items are separated by blank characters (space, 
tab, carriage return, or linefeed) or by comment lines introduced by character "#". The last number is followed
by exactly one character which usually is a linefeed character.

The rest of the file encodes a sequence of numbers representing the lengths of alternating runs of transparent
and black pixels. Lines are encoded starting with the top line and progressing toward the bottom line.
Each line starts with a white run. The decoder knows that a line is finished when the sum of the run lengths
for that line is equal to the number of columns in the image. Numbers in range 0 to 191 are represented by
a single byte in range 0x00 to 0xbf. Numbers in range 192 to 16383 are represented by a two byte
sequence: the first byte, in range 0xc0 to 0xff, encodes the six most significant bits of the number, the second
byte encodes the remaining eight bits of the number. This scheme allows for runs of length zero, which
are useful when a line starts with a black pixel, and when a very long run (whose length exceeds 16383)
must be split into smaller runs.
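
The byte-level rules above can be sketched in Python as follows. The function names encode_run and encode_line are hypothetical, and the code is an illustration of the described scheme rather than the encoder used by any DjVuLibre program.

```python
MAX_RUN = 16383  # largest length that fits in the two-byte form


def encode_run(length):
    """Encode one run length as one or two bytes."""
    if not 0 <= length <= MAX_RUN:
        raise ValueError("run length must be in range 0 to 16383")
    if length < 192:
        return bytes([length])  # single byte in range 0x00 to 0xbf
    # First byte 0xc0..0xff carries the six high bits, second byte the low eight.
    return bytes([0xC0 | (length >> 8), length & 0xFF])


def encode_line(runs):
    """Encode the alternating run lengths of one line, white run first.

    Runs longer than 16383 are split by inserting zero-length runs of the
    other color between the pieces.
    """
    out = bytearray()
    for length in runs:
        while length > MAX_RUN:
            out += encode_run(MAX_RUN)
            out += encode_run(0)  # zero-length run of the opposite color
            length -= MAX_RUN
        out += encode_run(length)
    return bytes(out)
```

A line that starts with five black pixels followed by three white ones is encoded as encode_line([0, 5, 3]), using a zero-length initial white run.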

Portable Pixmap (PPM) format 

The Portable Pixmap format is a well-known format for representing color images. Check the ppm(5) man
page for complete information.

The data always begin with a text header composed of the two characters "P6", the number of columns, the
number of rows, and the maximal value of a color component (usually 255). All numbers are expressed in
decimal ASCII. These four items are separated by blank characters (space, tab, carriage return, or linefeed)
or by comment lines introduced by character "#". The last number is followed by exactly one character
which usually is a linefeed character.

The rest of the file encodes all the pixels. Each pixel is represented by three bytes representing the red,
green and blue components of the pixel. Pixels are ordered left to right, top to bottom.

Comments in separated files

Each page is followed by an arbitrary number of comment lines starting with character "#" and terminated 
by a linefeed character. Comment lines whose first word starts with a capital letter have special meanings. 
The following constructs are currently defined: 

* # T px:py dx:dy wxh+x+y (string)

This construct indicates that the piece of text string must be associated with an area of size wxh at
position x,y relative to the lower left corner of the page. The string is UTF-8 encoded. Special characters
can be escaped as in PostScript using the backslash character. Integers px and py represent the
position of the current point on the text baseline before the text was drawn. The drawing operation then
moves the current point by dx and dy pixels. When such comments are present, csepdjvu produces a
hidden text layer for the corresponding pages.

* # L wxh+x+y (url)

This construct indicates that a hyperlink to URL url should be associated with an area of size wxh at
position x,y. When such comments are present, csepdjvu produces pages with an annotation chunk
containing the specified hyperlinks.

* # B count (string) (#pageno)

This construct provides outline information for the document. An outline entry entitled string is
associated with page pageno. Integer count indicates how many of the following outline entries must be
attached to the current entry as subentries. When such comments are present in the first page, csepdjvu
produces a navigation chunk with the specified outline.
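A program that writes separated files could build these comment lines as sketched below. The helper names are hypothetical, the colon between px and py (and between dx and dy) is taken from the syntax line above, and the escaping covers only backslashes and parentheses, which is an assumption about what "as in PostScript" requires in practice.

```python
def ps_escape(text: str) -> str:
    """Escape backslashes and parentheses as in a PostScript string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def text_comment(px, py, dx, dy, w, h, x, y, string):
    """Build a '# T' line attaching a piece of text to an area of the page."""
    return f"# T {px}:{py} {dx}:{dy} {w}x{h}+{x}+{y} ({ps_escape(string)})"


def link_comment(w, h, x, y, url):
    """Build a '# L' line attaching a hyperlink to an area of the page."""
    return f"# L {w}x{h}+{x}+{y} ({ps_escape(url)})"


def outline_comment(count, title, pageno):
    """Build a '# B' outline entry that owns the next `count` entries."""
    return f"# B {count} ({ps_escape(title)}) (#{pageno})"
```

Each returned line must still be encoded as UTF-8 and terminated by a linefeed character when it is written after the page data.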

CREDITS 

This program was initially written by Leon Bottou <leonb@users.sourceforge.net> and was improved by 
Bill Riemers <docbill@sourceforge.net> and many others. 

SEE ALSO 

djvu(1), ppm(5), c44(1)

NAME

ddjvu - Command line DjVu decoder. 

SYNOPSIS 

ddjvu -format=fmt [options] [djvufile] [outputfile]

DESCRIPTION 

Decodes the DjVu file djvufile and produces the image file outputfile.

The DjVu data is read from the standard input when argument djvufile is not specified or when it is equal to 
a single dash. Similarly, the output data is written to the standard output when argument outputfile is not 
specified or equal to a single dash. However a valid output file name is always required when producing a 
TIFF or PDF file. 

MAIN OPTIONS 

-format=fmt

Specify the output file format. The recognized file formats are pbm, pgm, ppm, pnm, rle, tiff,
and pdf.

* Formats pbm, pgm, and ppm respectively produce a Portable Bitmap (PBM), Portable 
Graymap (PGM), or Portable Pixmap (PPM) file. Format pnm produces a PBM, PGM, or 
PPM output file according to the color content of the output image. 

* Format rle produces a compact run length encoded bitonal file that is understood by the 
DjVuLibre commands cjb2 and csepdjvu. 

* Format tiff produces a Tagged Image File Format (TIFF) file using lossless compression. Enabling
lossy JPEG compression (see option -quality below) often produces much smaller files. Commands
tiffcp(1) and tiffsplit(1) are useful for manipulating the resulting TIFF files.

* Format pdf produces a Portable Document Format (PDF) file. Each page in the resulting file 
is represented by an image at the specified resolution, using lossless compression. Enabling 
lossy JPEG compression (see option -quality below) often produces much smaller files. An 
alternate way to produce a PDF file consists of first using djvups(1) and then converting the resulting
PostScript file to PDF. Which method gives better results depends on the contents of the
DjVu file and on the capabilities of the PS to PDF converter.

When option -format is not specified, the extension of argument outputfile has no influence on the 
default output format. Instead the program behavior is modified to ensure backward compatibility 
with previous versions of ddjvu. We recommend always specifying the output format using this
option.

-page=pagespec

Specify which pages should be decoded. When this option is not specified, all pages of the
document are decoded and concatenated into the output file. The page specification pagespec
contains one or more comma-separated page ranges. A page range is either a page number, or two
page numbers separated by a dash. For instance, specification 1-10 outputs pages 1 to 10, and 
specification 1,3,99999-4 outputs pages 1 and 3, followed by all the document pages in reverse 
order up to page 4. 
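The way a page specification expands into a list of pages can be sketched in Python. This is not the ddjvu parser: the function name is hypothetical, and clamping out-of-range page numbers to the document size is an assumption inferred from the 99999-4 example above.

```python
from __future__ import annotations


def expand_pagespec(spec: str, npages: int) -> list[int]:
    """Expand a specification such as '1,3,99999-4' into page numbers."""
    pages = []
    for part in spec.split(","):
        first, dash, last = part.strip().partition("-")
        start = int(first)
        end = int(last) if dash else start
        # Assumption: page numbers beyond the document are clamped.
        start = min(max(start, 1), npages)
        end = min(max(end, 1), npages)
        # A range whose second number is smaller runs in reverse order.
        step = 1 if end >= start else -1
        pages.extend(range(start, end + step, step))
    return pages
```

For a six page document, expand_pagespec("1,3,99999-4", 6) yields pages 1 and 3 followed by pages 6, 5, and 4.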

-eachpage 

When this option is specified, program ddjvu generates one separate file per page named by
replacing the %d specification in outputfile by the page number in a manner similar to the
printf(3) function.

-mode=mode

Selects which layers of the DjVu image should be rendered. Valid rendering modes are color, 
black, mask, foreground, and background.

* Rendering mode color is the default mode. When the DjVu file is bitonal, bitonal or gray-level
output is produced depending on the subsampling factor. Otherwise a color image is produced. 

* Rendering mode black is useful to extract a meaningful black and white image. Bitonal or
gray-level output is produced depending on the subsampling factor.

* Rendering modes mask, foreground, and background select specific layers of a DjVu image. 
These modes can fail if the DjVu image does not contain the selected layer. 

-skip Instead of aborting when encountering a corrupted page, this option causes ddjvu to simply skip 
the corrupted page and continue with the next. This is useful for processing certain damaged files. 

RESOLUTION OPTIONS 

The following options control the resolution of the output image. The default resolution is the native
resolution of the DjVu file, equivalent to selecting -1.

-n Specify an integer sub-sampling factor. The dimensions of the full output image will be n times 
smaller than the DjVu image size. The legal values for argument n range from 1 to 12. Option -1, 
for instance, produces an output image whose resolution is equal to the resolution of the input 
DjVu image file. 

-subsample=n 

This is equivalent to option -n. 

-scale=mag

Specify a magnification factor relative to the resolution stored in the DjVu image. Specifying a
magnification of 100 produces an image suitable for displaying on a 100 dpi device such as a
computer screen. The magnification factor mag can also be interpreted as the resolution of the output
image expressed in dots per inch.

-size=wxh

Specify the size of the full output image. Rendering the full DjVu image would create an output 
image whose width and height would not exceed w and h. To change the aspect ratio, you must 
also use option -aspect=no. 

-aspect=yesno

This option indicates whether the image aspect ratio should be preserved. The default is to
preserve the aspect ratio. This option permits changes in the aspect ratio when used in combination
with option -size.

OTHER OPTIONS 

-verbose 

Display informational messages describing the structure of the DjVu image and the format of the 
output file. 

-segment=wxh+x+y

Specify an image segment to render. Program ddjvu conceptually renders the full page using the
specified resolution, and then extracts a sub-image of width w and height h, starting at position
(x,y) relative to the bottom left corner of the page. Both operations of course happen
simultaneously. Rendering a small sub-image is much faster than rendering the complete image. The
output file will always have size wxh when this option is specified.
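The wxh+x+y syntax and the bottom left origin can be illustrated with a short Python sketch. The function names are hypothetical, only non-negative integers are accepted, and the conversion to rows counted from the top is shown because raster formats such as PPM store the top row first.

```python
from __future__ import annotations

import re

GEOMETRY = re.compile(r"(\d+)x(\d+)\+(\d+)\+(\d+)")


def parse_geometry(spec: str) -> tuple[int, int, int, int]:
    """Split 'wxh+x+y' into the integers (w, h, x, y)."""
    match = GEOMETRY.fullmatch(spec)
    if match is None:
        raise ValueError(f"bad geometry: {spec!r}")
    w, h, x, y = (int(group) for group in match.groups())
    return w, h, x, y


def segment_rows(spec: str, page_height: int) -> tuple[int, int]:
    """Return the (first, one past last) rows of the segment, counted
    from the top of a rendered page that is page_height pixels tall."""
    w, h, x, y = parse_geometry(spec)
    top = page_height - (y + h)  # y is measured from the bottom edge
    return top, top + h
```

On a page rendered 1000 pixels tall, segment 300x200+50+100 covers rows 700 to 899 counted from the top.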

-quality=factor

Enables lossy JPEG compression for TIFF and PDF files. This option only affects images that 
cannot be encoded using the preferred TIFF/G4 compression. Argument factor is a quantization 
factor ranging from 25 to 150. See command cjpeg(1) for more information on JPEG quantization
factors. Value 80 is a good starting point.

-quality=uncompressed

Completely disables compression in TIFF and PDF files. Although the resulting files are often 
huge, this is sometimes useful for maximal compatibility with hastily written software. 

-quality=deflate 

Enables DEFLATE compression for TIFF files. Images that cannot be encoded using the preferred 
TIFF/G4 compression will be encoded with DEFLATE compression if available. Otherwise the 
more portable PACKBITS compression is used. Specifying this option is not necessary for PDF 
files because this is the default behavior. 

DEPRECATED OPTIONS 

Various options have been maintained to ensure backward compatibility with previous versions of ddjvu. 
When option -format is not specified, the program only decodes the first page of the document and the 
default resolution becomes -scale=100. Options -size, -scale, -segment, and -page accept an argument
separated by a space. Options -foreground, -background, and -black are shorthands for the -mode=mode
option. Please do not rely on these features.


EXAMPLES 

Command 

ddjvu -format=tiff myfile.djvu myfile.tif

decodes all pages and produces a multipage TIFF file.

Command 

ddjvu -format=ppm -page=1-10 -eachpage -size=100x100 myfile.djvu thumb%03d.ppm

produces 100x100 thumbnails for the first ten pages of a document and outputs them as PPM files named
thumb001.ppm to thumb010.ppm.

CREDITS 

The new version of this program was written by Leon Bottou <leonb@users.sourceforge.net>. 

This program includes code derived from program tiff2pdf, written by Ross Finlayson and released under a 
BSD license. 

SEE ALSO 

djvu(1), djview(1), pnm(5), pbm(5), pgm(5), ppm(5), cjpeg(1), tiffsplit(1), tiffcp(1), printf(3)

NAME

djvm - Manipulate bundled multi-page DjVu documents. 

SYNOPSIS 

Creating a bundled document: 

djvm -c[reate] doc.djvu page1.djvu ... pageN.djvu

Inserting: 

djvm -i[nsert] doc.djvu page.djvu [ pagenum ] 

Removing: 

djvm -d[elete] doc.djvu pagenum 

Listing: 

djvm -l[ist] doc.djvu 

DESCRIPTION 

This program creates or modifies a bundled multi-page DjVu document. Multi-page bundled documents 
can be used directly or converted to indirect documents using command djvmcvt.

OPTIONS 

-c[reate] 

Create a bundled DjVu document named doc.djvu by collecting files page1.djvu to pageN.djvu.

-i[nsert] 

Modify the bundled DjVu document named doc.djvu by inserting file page.djvu as page pagenum. 
Omitting argument pagenum means that the page should be appended at the end of the document. 
File page.djvu can also be a multi-page DjVu document. All pages will be inserted at the
specified location.


-d[elete] 

Remove page pagenum from the bundled multi-page DjVu document doc.djvu. 

-l[ist] List all component files in the multi-page DjVu document doc.djvu. 

CREDITS 

This program was initially written by Andrei Erofeev <andrew_erofeev@yahoo.com> and was improved
by Bill Riemers <docbill@sourceforge.net> and many others. 


SEE ALSO 

djvu(1), djvmcvt(1)

NAME

djvmcvt - Convert multi-page DjVu documents. 

SYNOPSIS 

Creating a bundled document: 

djvmcvt -b[undled] docin.djvu docout.djvu 

Creating an indirect document: 

djvmcvt -i[ndirect] docin.djvu dir index.djvu 

DESCRIPTION 

This program converts any multi-page DjVu document to either the bundled or indirect multi-page format. 
The input file docin.djvu must be either the file name of a bundled document or the index file of an indirect 
document. 

OPTIONS 

-b[undled] 

Create a bundled multi-page DjVu document named docout.djvu. 

-i[ndirect] 

Create an indirect multi-page DjVu document. All the files composing the indirect document will 
be stored into directory dir. The index file will be named index.djvu. 


CREDITS 

This program was initially written by Andrei Erofeev <andrew_erofeev@yahoo.com> and was improved
by Bill Riemers <docbill@sourceforge.net> and many others. 


SEE ALSO 

djvu(1), djvmcvt(1)

NAME

djvudigital - creates DjVu files from PS or PDF files. 

SYNOPSIS 

djvudigital [ options ] inputfile [ outputfile ] 

DESCRIPTION 

This program creates a DjVu file from the PostScript (.ps), GZipped PostScript (.ps.gz), Encapsulated
PostScript (.eps), or Portable Document Format (.pdf) file inputfile.

The output file name is either given by argument outputfile or generated by replacing the input file name 
suffixes by the DjVu suffix (.djvu). 

PREREQUISITES 

This program depends on a specific GhostScript driver. If your GhostScript program does not provide this 
driver, please check http://djvu.sourceforge.net/gsdjvu.html. 

OPTIONS 

--verbose, -v

Displays more informational messages while converting the file. 

--quiet, -q

Do not display informational messages while converting the file. 

--dpi=resolution

Set the desired resolution to resolution dots per inch. The default is 300 dpi.

--psrotate=angle

Rotate the PostScript file by angle degrees clockwise. Only the values 0, 90, 180, and 270 are 
supported. This option only applies to PostScript files. PDF files are always converted according 
to their native orientation. 

--epsf=disposition

Specify how to handle Encapsulated PostScript files. Argument disposition can take the values 
crop, fit, and ignore. The default disposition crop creates a DjVu file whose size matches the 
bounding box of the Encapsulated PostScript file. Value fit rescales the graphics to the default 
page size. Value ignore disables all Encapsulated PostScript specific code. This option requires 
Ghostscript 7.07 or better. 

--exact-color

Enables a more accurate rendering of the colors. This option requires GhostScript 6.52 or better. 

--threshold=thres

Specify a threshold for the foreground/background separation code. Acceptable values of thres 
range from 0 to 100. Larger values place more information into the foreground layer. The default 
threshold value is 80. 

--bg-subsample=sub

Specify the background subsampling ratio. Argument sub must be an integer between 1 and 6. 
The default value is 3. 

--bg-slices=n+...+n

Specify the encoding quality of the background layer. The syntax for the argument is similar to 
that described for the -slice option of command c44. The default is 72+11+10+10. 

--fg-colors=ncolors

Specify the maximum number of distinct colors in the foreground layer. Argument ncolors can 
take integer values between 1 and 4000. The default value is 256.

--fg-image-colors=ncolors

Specify the maximum number of distinct colors in an image for considering encoding it into the 
foreground layer. Argument ncolors can take integer values between 1 and 4000. The default 
value is 256. 

--words

Extract the text from the PostScript code and incorporate this information into the DjVu file. This
option records the location of every word.

--lines Extract the text from the PostScript code and incorporate this information into the DjVu file. This
option saves a few bytes by only recording the location of each line.

--gsarg=arg1[,arg2,...,argN]

Insert extra arguments on the GhostScript command line. 

--cseparg=arg1[,arg2,...,argN]

Insert extra arguments on the command line of program csepdjvu or msepdjvu. 

--sepfile

Produces a separated data file instead of a DjVu file. Program csepdjvu can then convert the
separated data file into a DjVu file.

--check

Display the names of the two auxiliary programs found by djvudigital, namely a suitable
ghostscript interpreter and a suitable backend encoder. See the next two sections for details.

--dryrun

Simply display the ghostscript command line generated by djvudigital without running it. No
output file is produced.

--help Display the manual page for djvudigital.

GHOSTSCRIPT ISSUES 

Program djvudigital internally relies on a specific Ghostscript driver named djvusep. This driver analyzes 
the logical structure of the sequence of PostScript rendering commands and decides to execute each
command into either the foreground or the background layer. The GhostScript driver produces a separated data
file that is then compressed using the DjVuLibre program csepdjvu. 

Before processing the input file, program djvudigital searches for a Ghostscript executable providing the
djvusep driver. The search starts with the file specified by the environment variable GSDJVU and
continues with command line executables named gs and gsdjvu.

The DjVuLibre source code contains instructions to compile such a GhostScript executable. More
information can be obtained from http://djvu.sourceforge.net/gsdjvu.html.

CSEPDJVU ISSUES 

The output of the djvusep GhostScript driver must be processed by the DjVuLibre program csepdjvu. This
program can also be replaced by the proprietary Lizardtech program msepdjvu. Before processing the
input file, program djvudigital searches for such an executable. The search starts with the file specified by the
environment variable CSEPDJVU and continues with command line executables named msepdjvu and
csepdjvu.
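The search order described in this section and the previous one (environment variable first, then named executables on the command path) can be sketched in Python. This is not the code of djvudigital: the function name find_program is hypothetical, and the sketch does not check that the Ghostscript it finds actually provides the djvusep driver.

```python
from __future__ import annotations

import os
import shutil


def find_program(env_var: str, candidates: list[str]) -> str | None:
    """Return the file named by env_var if it is an executable file,
    otherwise the first candidate found on the command search path."""
    path = os.environ.get(env_var)
    if path and os.path.isfile(path) and os.access(path, os.X_OK):
        return path
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None


# Search order stated above for the two auxiliary programs.
ghostscript = find_program("GSDJVU", ["gs", "gsdjvu"])
encoder = find_program("CSEPDJVU", ["msepdjvu", "csepdjvu"])
```

Either result is None when no suitable program is found, which is the situation in which option --check is useful for diagnosis.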

CREDITS 

The first version of this converter was written by Leon Bottou <leonb@users.sourceforge.net> at AT&T
Labs. The DjVuLibre version is derived from code graciously released by Lizardtech in January 2004. 


BUGS 

Program djvudigital can only process input files that GhostScript can process properly.

SEE ALSO

djvu(1), csepdjvu(1), c44(1), gs(1), gzip(1)

NAME

djvudump - Display internal structure of DjVu files. 

SYNOPSIS 

djvudump [-o outputfile] djvufiles...

DESCRIPTION 

Program djvudump prints an indented representation of the chunk structure of any DjVu file. Each line
contains a chunk ID followed by the chunk size. Lines are indented in order to reflect the
hierarchical structure of the IFF files. The page identifier is printed between curly braces when a bundled
multi-page DjVu document is recognized. Additional information about each chunk is provided when djvudump
recognizes the chunk name and knows how to summarize the chunk data. 

REMARKS 

This program is in fact able to describe any file complying with the Electronic Arts IFF 85 specification. 
This includes a number of graphical and sound file formats. 

CREDITS 

This program was written by Leon Bottou <leonb@users.sourceforge.net> and was then improved by 
Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers <docbill@sourceforge.net> and many
others.

SEE ALSO 

djvu(1)

NAME

djvuextract - Extract chunks from DjVu image files. 

SYNOPSIS 

djvuextract [-page=pagenum] djvufile [chkid=filename]...

DESCRIPTION 

Program djvuextract extracts raw chunk data from a DjVu file djvufile. These chunks can then be
reassembled into DjVu files using program djvumake.

Option -page can be used to specify a particular page. Otherwise the first page of the document is 
assumed. Each remaining argument specifies that the raw data associated with all the chunks named chkid 
will be concatenated into the file named filename. Chunks named BG44 and FG44 are handled slightly
differently: the program generates legal IW44 files instead of simply saving the raw data.

See the man page djvumake(1) for related information.

CREDITS 

This program was written by Leon Bottou <leonb@users.sourceforge.net> and was then improved by 
Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers <docbill@sourceforge.net> and many
others.


SEE ALSO 

djvu(1), djvumake(1)

NAME

djvumake - Assemble DjVu image files. 

SYNOPSIS 

djvumake djvufile [ chkid=argument ]... 

DESCRIPTION 

Program djvumake assembles a single-page DjVu file djvufile by copying or creating chunks according to 
the provided arguments. Supported arguments are as follows: 

INFO=w,h,dpi

Create the initial information chunk. Arguments w, h, and dpi describe the width, height and
resolution of the image. All arguments may be omitted. The default resolution is 300 dpi. The default
width and height will be retrieved from the first mask chunk specified in the command line 
options. 

Sjbz=jb2file

Create a JB2 foreground mask chunk. File jb2file may contain raw JB2 data, or be a DjVu file 
containing JB2 data such as those produced by program cjb2. 

Smmr=mmrfile

Create an MMR/G4 foreground mask chunk. File mmrfile may contain raw MMR data or be a DjVu
file containing MMR data. 

BG44=iw44file[:n]

Create one or more IW44 background chunks. File iw44file must contain IW44 data. Such files 
can be obtained by compressing the background image with program c44 and extracting the raw 
IW44 data using program djvuextract. The optional argument n indicates the number of chunks 
to copy from the IW44 file. Omitting the number of chunks copies all available chunks. 

BGjp=jpegfile

Create a JPEG encoded background chunk. File jpegfile must contain JPEG encoded data. 

BG2k=jpegfile

Create a JPEG-2000 background chunk. File jpegfile must contain JPEG-2000 encoded data. The 
DjVu decoder does not yet display files containing JPEG-2000 data. 

FGbz=(filename|{#color[:x,y,w,h]})

Create a foreground color chunk describing one solid color for each JB2 encoded mark. The
argument can be the name filename of a file containing the raw data. Such files are best created using
program djvuextract(1). Alternatively the argument could describe a sequence of color zones.
Each color zone specifies a color name color, and optionally the coordinates x,y,w,h of a
rectangle. Each mark receives the color of the last color zone whose rectangle intersects the bounding
box of the mark. The mark is painted black if its bounding box does not intersect any of the
zones. The rectangle coordinates are expressed in pixels with the origin at the bottom left corner
of the page. The full page is assumed when no rectangle coordinates are specified. Color names
can be specified with exactly six hexadecimal digits, e.g. FGbz=#FF8080, or by one of the
following sixteen HTML color names defined by the W3C, e.g. FGbz=#red.


aqua      black     blue      fuchsia
gray      green     lime      maroon
navy      olive     purple    red
silver    teal      white     yellow
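The rule that each mark receives the color of the last intersecting zone can be sketched in Python. This is not the djvumake implementation: the function names are hypothetical, and the sketch assumes that several zones are simply written one after another in a single argument, as in #black#red:0,0,100,50.

```python
from __future__ import annotations

HTML_COLORS = {
    "aqua": "00FFFF", "black": "000000", "blue": "0000FF", "fuchsia": "FF00FF",
    "gray": "808080", "green": "008000", "lime": "00FF00", "maroon": "800000",
    "navy": "000080", "olive": "808000", "purple": "800080", "red": "FF0000",
    "silver": "C0C0C0", "teal": "008080", "white": "FFFFFF", "yellow": "FFFF00",
}


def parse_color(name: str) -> tuple[int, int, int]:
    """Accept one of the sixteen HTML names or six hexadecimal digits."""
    digits = HTML_COLORS.get(name.lower(), name)
    if len(digits) != 6:
        raise ValueError(f"bad color: {name!r}")
    value = int(digits, 16)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def parse_zones(arg: str) -> list[tuple[tuple[int, int, int], tuple | None]]:
    """Split '#color[:x,y,w,h]...' into (rgb, rectangle or None) pairs."""
    zones = []
    for item in arg.split("#")[1:]:
        color, colon, rect = item.partition(":")
        box = tuple(int(v) for v in rect.split(",")) if colon else None
        zones.append((parse_color(color), box))
    return zones


def intersects(a: tuple, b: tuple) -> bool:
    """True when rectangles given as (x, y, w, h) overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def mark_color(zones: list, bbox: tuple) -> tuple[int, int, int]:
    """Color of the last zone intersecting bbox, black when none does."""
    color = (0, 0, 0)
    for rgb, box in zones:
        if box is None or intersects(box, bbox):  # no rectangle: full page
            color = rgb
    return color
```

Because later zones override earlier ones, a full-page zone listed first acts as a default color for marks that no other zone touches.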


FG44=iw44file

Create an IW44 foreground color chunk. File iw44file must contain IW44 data. Such files can be
obtained by compressing the background image with command c44 and extracting the raw IW44
data using program djvuextract. Only the first chunk is copied.

FGjp=jpegfile

Create a JPEG foreground color chunk. 

FG2k=jpegfile

Create a JPEG-2000 foreground color chunk. The DjVu decoder does not yet display files
containing JPEG-2000 data.

INCL=fileid

Create a DjVu3 include chunk pointing to the component file named fileid. The resulting file 
should then be included into a multipage document using command djvm. 

Djbz=jb2file

Create a JB2 shape dictionary. File jb2file must contain raw JB2 data describing a JB2 dictionary.

PPM=ppmfile

Create an IW44 background chunk and an IW44 foreground color chunk by masking and subsampling
the PPM file ppmfile.

Assume, for instance, that we have a PPM image myimage.ppm and an identically sized PBM 
bitonal image mymask.pbm whose black pixels indicate which pixels belong to the foreground. 
Such a bitonal file might have been obtained by thresholding or color-keying the PPM image. We 
can then produce a DjVuDocument image using the following two commands: 

cjb2 mymask.pbm mymask.djvu 

djvumake my.djvu Sjbz=mymask.djvu PPM=myimage.ppm 

The DjVu specification documents in the directory doc of the DjVuLibre distribution provide the
authoritative information about the composition of a legal DjVu image file.

CREDITS 

This program was written by Leon Bottou <leonb@users.sourceforge.net> and was then improved by 
Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers <docbill@sourceforge.net> and many
others.

SEE ALSO 

djvu(1), djvuextract(1), cjb2(1), c44(1)

NAME

djvups - Convert DjVu documents to PostScript. 

SYNOPSIS 

djvups [options] [djvufile] [outputfile] 

DESCRIPTION 

This program decodes DjVu file djvufile, and generates a PostScript file named outputfile. The DjVu data is 
read from the standard input when argument djvufile is not specified or when it is equal to a single dash. 
Similarly, the output data is written to the standard output when argument outputfile is not specified or 
equal to a single dash. 

PostScript printers have various capabilities. Investigate options -level and -gray for obtaining the best 
results. 

OPTIONS 

-help Prints the list of recognized options. 

-verbose 

Displays a progress bar. 

-page=pagespec

Specify the document pages to be converted. The page specification pagespec contains one or 
more comma-separated page ranges. A page range is either a page number, or two page numbers 
separated by a dash. Specification 1-10, for instance, prints pages 1 to 10. Specification 
1,3,99999-4 prints pages 1 and 3, followed by all the document pages in reverse order up to page 
4. 

-format=ps 

Produce a PostScript file. This is the default. 

-format=eps 

Produce an Encapsulated PostScript file. Encapsulated PostScript files are suitable for embedding 
images into other documents. Encapsulated PostScript files can only contain a single page. Setting
this option overrides the options -copies, -orientation, -zoom, -cropmarks, and -booklet. 

-copies=n 

Specify the number of copies to print. 

-orientation=orient

Specify whether pages should be printed using the auto, portrait, or landscape orientation. 

-mode=modespec

Specify how pages should be decoded. The default mode, color, renders all the layers of the DjVu 
documents. Mode black only renders the foreground layer mask. This mode does not work with 
DjVuPhoto images because these files have no foreground layer mask. Modes foreground and 
background only render the foreground layer or the background layer of a DjVuDocument image. 

-zoom=zoomspec

Specify a zoom factor zoomspec. The default zoom factor, auto, scales the image to fit the page.
Argument zoomspec can also be a number in range 25 to 2400 representing a magnification
percentage relative to the original size of the document.

-frame=yesno

Specifying yes causes the generation of a thin gray border representing the boundaries of the
document pages. The default is no.

-cropmarks=yesno

Specifying yes causes the generation of crop marks indicating where pages should be cut. The 
default is no. 

-level=languagelevel

Select the language level of the generated PostScript, languagelevel. Valid language levels are 1,
2, and 3. Level 3 produces the most compact and fast printing PostScript files. Some of these files
however require a very modern printer. Level 2 is the default value. The generated PostScript files
are almost as compact and work with all but the oldest PostScript printers. Level 1 can be used as 
a last resort option. 

-color=yesno

The default value, yes, generates a color PostScript file. Specifying value no converts the image to 
gray scale. The resulting PostScript file is smaller and marginally more portable. 

-gray This option is equivalent to option -color=no and is provided for convenience. 

-colormatch=yesno

The default value, yes, generates a PostScript file using device independent colors in compliance
with the sRGB specification. Modern printers then produce colors that match the original as well
as possible. Specifying value no generates a PostScript file using device dependent colors. This is 
sometimes useful with older printers. You can then use option -gamma to tune the output colors. 

-gamma=gammaspec

Specify a gamma correction factor for the device dependent PostScript colors. Argument
gammaspec must be in range 0.3 to 5.0. Gamma correction normally pertains to cathode-ray screens only.
It becomes meaningful for printers because several models interpret device dependent RGB colors by
emulating the color response of a cathode-ray tube.

-booklet=opt

Turns the booklet printing mode on. The booklet mode prints two pages on each side in a way 
suitable for making a booklet by folding the sheets. Option opt can take values no for disabling 
the booklet mode, yes for enabling the recto/verso booklet mode, and recto or verso to print only 
one side of each sheet. 

-bookletmax=max

Specifies the maximal number of pages per booklet. A single printout might then be composed of 
several booklets. Argument max is rounded up to the next multiple of 4. Specifying 0 sets no 
maximal number of pages and ensures that the printout will produce a single booklet. This is the 
default. 

-bookletalign=align

Specifies a positive or negative offset applied to the verso of each sheet. Argument align is
expressed in points (one point is 1/72 of an inch, or 0.352 millimeter). This is useful with certain
printers to ensure that both recto and verso are properly aligned. The default value is 0.

-bookletfold=base[+incr]

Specifies the extra margin left between both pages on a single sheet. The base value base is
expressed in points (one point is 1/72 of an inch, or 0.352 millimeter). This margin is
incremented for each outer sheet by value incr expressed in millipoints. The default value is 18+200.
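The arithmetic behind the booklet options can be illustrated with a short Python sketch. The function names are hypothetical, and numbering the sheets from the innermost one (sheet 0) outwards is an assumption about how "each outer sheet" is counted.

```python
def round_booklet_max(max_pages: int) -> int:
    """Round up to the next multiple of 4; 0 keeps meaning 'no maximum'."""
    return (max_pages + 3) // 4 * 4


def points_to_mm(points: float) -> float:
    """One point is 1/72 of an inch and one inch is 25.4 millimeters."""
    return points * 25.4 / 72.0


def fold_margin(base: float, incr: float, sheet: int) -> float:
    """Margin in points for a sheet: base points plus incr millipoints
    for every sheet between it and the innermost sheet (assumed sheet 0)."""
    return base + sheet * incr / 1000.0
```

With the default 18+200, the innermost sheet gets an 18 point margin and, under the numbering assumed here, the sheet five positions further out gets 19 points.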


CREDITS 

This program was written by Leon Bottou <leonb@users.sourceforge.net>, Andrei Erofeev
<andrew_erofeev@yahoo.com>, and Florin Nicsa.


SEE ALSO 

djvu(1), ddjvu(1), djview(1)

NAME

djvused - Multi-purpose DjVu document editor. 

SYNOPSIS 

djvused [options] djvufile 


DESCRIPTION 

Program djvused is a powerful command line tool for manipulating multi-page documents, creating or 
editing annotation chunks, creating or editing hidden text layers, pre-computing thumbnail images, and 
more. The program first reads the DjVu document djvufile and executes a number of djvused commands. 

Djvused commands can be read from a specific file (when option -f is specified), read from the command 
line (when option -e is specified), or read from the standard input (the default). 

OPTIONS 

-v Cause djvused to print a command line prompt before reading commands and a brief message 
describing how each command was executed. This option is very useful for debugging djvused 
scripts and also for interactively entering djvused commands on the standard input. 

-f scriptfile 

Cause djvused to read commands from file scriptfile. 

-e command 

Cause djvused to execute the commands specified by the option argument command. It is
advisable to surround the djvused commands by single quotes in order to prevent unwanted shell
expansion.

-s Cause djvused to save the file djvufile after executing the specified commands. This is similar to 
executing command save immediately before terminating the program. 

-u Cause djvused to print hidden text and annotations as UTF-8 instead of encoding non-ASCII
characters with octal escape sequences for maximal portability. This option is convenient for manually
editing or viewing the djvused output. This option also causes the emission of a UTF-8 BOM
under Windows.

-n Cause djvused to disregard save commands. This is useful for debugging djvused scripts without 
overwriting files on your disk. 

DJVUSED EXAMPLES 

There are many ways to use program djvused. The following examples illustrate some common uses of 
this program. 

Obtaining the size of a page 

Command size outputs the width and height of the selected pages using an HTML friendly syntax. For
instance, the following command prints the size of page 3 of document myfile.djvu.

djvused myfile.djvu -e 'select 3; size'

Extracting the hidden text 

Command print-pure-txt outputs the text associated with a page or a document. For instance, the follow¬ 
ing shell command outputs the text for the entire document. Lines and pages are delimited by the usual 
control characters. 

djvused myfile.djvu -e ’print-pure-txt’ 

Command print-txt produces a more extensive output describing the structure and the location of the text
components. The syntax of this output is described later in this man page. For instance, the following shell
command outputs extended text information for page 3 of document myfile.djvu.

djvused myfile.djvu -e 'select 3; print-txt'

Extracting the annotations 

Annotation data can be extracted using command print-ant. The syntax of the annotation data is described 
later in this man page. For instance, the following shell command outputs the annotation data for the first 
page of document myfile.djvu. 

djvused myfile.djvu -e 'select 1; print-ant'

Command print-ant only prints the annotations stored in the selected component file. Command
print-merged-ant also retrieves annotations from all the component files referenced by the current page
(using INCL chunks) and prints the merged information.

Dumping/restoring annotations and text 

Three commands, output-txt, output-ant, and output-all, produce djvused scripts. For instance, the
following shell command produces a djvused script, myfile.dsed, that recreates all the text and annotation
data in document myfile.djvu.

djvused myfile.djvu -e 'output-all' > myfile.dsed

Script myfile.dsed is a text file that can be easily edited. The following shell command then recreates the 
text and annotation information in file myfile.djvu. 

djvused myfile.djvu -f myfile.dsed -s 
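
The same dump, edit, and restore cycle can be driven from a program. The Python sketch below only uses the options shown above (-e, -f, and -s); the function names and the edit callback are hypothetical, and the sketch assumes that djvused is available on the search path.

```python
import subprocess

def dump_command(djvufile):
    """Argument list that prints a script recreating all text and annotations."""
    return ["djvused", djvufile, "-e", "output-all"]

def restore_command(djvufile, scriptfile):
    """Argument list that executes scriptfile and saves the document."""
    return ["djvused", djvufile, "-f", scriptfile, "-s"]

def roundtrip(djvufile, scriptfile, edit):
    """Dump the script, pass it through the callable edit, and apply the result."""
    script = subprocess.run(dump_command(djvufile), check=True,
                            capture_output=True, text=True).stdout
    with open(scriptfile, "w", encoding="utf-8") as handle:
        handle.write(edit(script))
    subprocess.run(restore_command(djvufile, scriptfile), check=True)

print(restore_command("myfile.djvu", "myfile.dsed"))
```

Adding option -n to the restore command while testing prevents the document from being overwritten.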

Extracting a page 

Both commands save-page and save-page-with create a DjVu file representing the selected component file 
of a document. The following shell command, for instance, creates a file p05.djvu containing page 5 of 
document myfile.djvu. 

djvused myfile.djvu -e 'select 5; save-page p05.djvu'

Each page of a document might import data from another component file using the so-called inclusion
(INCL) chunks. Command save-page then produces a file with unresolved references to imported data.
Such a file should then be made part of a multi-page document containing the required data in other
component files. On the other hand, command save-page-with copies all the imported data into the output
file. This file is directly usable. Yet collecting several such files into a multi-page document might lead to
useless data replication.

Pre-computing thumbnails 

Command set-thumbnails constructs thumbnails that can be later displayed by DjVu viewers. The
following shell command, for instance, computes thumbnails of size 64x64 pixels for all pages of file
myfile.djvu.

djvused myfile.djvu -e 'set-thumbnails 64' -s

DJVUSED COMMANDS

Command lines might contain zero, one, or more djvused commands and an optional comment. Multiple
djvused commands must be separated by a semicolon character ';'. Comments are introduced by the '#'
character and extend until the end of the command line.

Selection commands 

Multi-page DjVu documents are composed of a number of component files. Most component files describe
a specific page of a document. Some component files contain information shared by several pages such as
shared image data, shared annotations or thumbnails. Many djvused commands operate on selected
component files. All component files are initially selected. The following commands are useful for changing
the selection.

n Print the total number of pages in the document.

ls List all component files in the document. Each line contains an optional page number, a letter
describing the component file type, the size of the component file, and the identifier of the component
file. Component file type letters P, I, A, and T respectively stand for page data, shared image data,
shared annotation data, and thumbnail data. Page numbers are only listed for component files
containing page data. When it is set, the optional page title (see command set-page-title below) is
displayed after the component file identifier.

select [ fileid ] 

Select the component file identified by argument fileid. Argument fileid must be either a page 
number or a component file identifier. The select command selects all component files when the 
argument fileid is omitted. 

select-shared-ant 

Select a component file containing shared annotations. Only one such component file is supported 
by the current DjVu software. This component file usually contains annotations pertaining to the 
whole document as opposed to specific pages. An error message is displayed if there is no such 
component file. 

create-shared-ant 

Create and select a component file containing shared annotations. This command only selects the
shared annotation component file if such a component file already exists. Otherwise it creates a
new shared annotation component file and makes sure that it is imported by all pages in the
document.

showsel

Show the currently selected component files with the same format as command ls.

Text and annotation commands

print-pure-txt

Print the text stored in the hidden text layer of the selected pages. A similar capability is offered 
by program djvutxt. Structural information is sometimes represented by control characters. Text 
from different pages is delimited by form feed characters ("\f"). Lines are delimited by newline 
characters ("\n"). Columns, regions, and paragraphs are sometimes delimited by vertical tab 
("\013"), group separators ("\035") and unit separators ("\037") respectively. 

print-txt 

Prints extensive hidden text information for the selected pages. This information describes the 
structure of the text on the document page and locates the structural elements in the page image. 
The syntax of this output is described later in this man page. 

remove-txt 

Remove the hidden text information from the selected component files. For instance, executing
commands select and remove-txt removes all hidden text information from the DjVu document.

set-txt [ djvusedtxtfile ] 

Insert hidden text information into the selected pages. The optional argument djvusedtxtfile names
a file containing the hidden text information. This file must contain data similar to what is
produced by command print-txt. When the optional argument is omitted, the program reads the
hidden text information from the djvused script until reaching an end-of-file or a line containing a
single period.

output-txt

Print a djvused script that reconstructs the hidden text information for the selected pages. This
script can later be edited and executed by invoking program djvused with option -f.

print-ant

Prints the annotations of the selected component file. The annotation data is represented using a 
simple syntax described later in this document. 

print-merged-ant 

Merge the annotations stored in the selected component files with the annotations imported from
other component files such as the shared annotation component file. The annotation data is
represented using a simple syntax described later in this document.

remove-ant 

Remove the annotation information from the selected component files. For instance, executing 
commands select and remove-ant removes all annotation information from the DjVu document. 

set-ant [ djvusedantfile ] 

Insert annotations into the selected component file. The optional argument djvusedantfile names a 
file containing the annotation data. This file must contain data similar to what is produced by 
command print-ant. When the optional argument is omitted, the program reads the annotation 
data from the djvused script itself until reaching an end-of-file or a line containing a single period. 

output-ant 

Print a djvused script that reconstructs the annotation information for the selected pages. This 
script can later be edited and executed by invoking program djvused with option -f. 

print-meta 

Print the meta-data part of the annotations for the selected component file. This command
displays a subset of the information printed by command print-ant using a different syntax.
Meta-data are organized as key-value pairs. Each printed line contains the key name, such as author,
title, etc., followed by a tab character ("\t") and a double-quoted string representing the UTF-8
encoded meta-data value.

remove-meta

Remove the meta-data part of the annotations of the selected component files.

set-meta [ djvusedmetafile ]

Set the meta-data part of the annotations of the selected component file. The remaining part of the
annotations is left unchanged. The optional argument djvusedmetafile names a file containing the
meta-data. This file must contain data similar to what is produced by command print-meta.
When the optional argument is omitted, the program reads the meta-data from the djvused
script itself until reaching an end-of-file or a line containing a single period.

print-xmp 

Print the XMP metadata string contained in the annotation chunk of the selected component file. 
This command displays in fact a subset of the information printed by command print-ant. 

remove-xmp 

Remove the XMP tag from the annotation chunk of the selected component file.

set-xmp [ xmpfile ]

Set the XMP metadata part of the annotations of the selected component file. The remaining part 
of the annotations is left unchanged. The optional argument xmpfile names a file containing the 
XMP metadata in a format similar to that produced by command print-xmp. When the optional 
argument is omitted, the program reads the XMP annotation data from the djvused script itself 
until reaching an end-of-file or a line containing a single period. 

output-all 

Print a djvused script that reconstructs both the hidden text and the annotation information for the 
selected pages. This script can later be edited and executed by invoking program djvused with 
option -f. 

Outline/bookmarks commands 


print-outline 

Print the outline of the document. Nothing is printed if the document contains no outline. 

remove-outline 

Remove the outline from the document.

set-outline [ djvusedoutlinefile ]

Insert outline information into the document. The optional argument djvusedoutlinefile names a
file containing the outline information. This file must contain data similar to what is produced by
command print-outline. When the optional argument is omitted, the program reads the outline
information from the djvused script until reaching an end-of-file or a line containing a single
period.

Thumbnail commands

set-thumbnails sz

Compute thumbnails of size sz x sz pixels and insert them into the document. DjVu viewers can
later display these thumbnails very efficiently without needing to download the data for each page.
Typical thumbnail sizes range from 48 to 128 pixels.

remove-thumbnails 

Remove the pre-computed thumbnails from the DjVu document. New thumbnails can then be 
computed using command set-thumbnails. 

Save commands 

The above commands only modify the memory image of the DjVu document. The following commands 
provide means to save the modified data into the file system. 

save Save the modified DjVu document back into the input file djvufile specified by the arguments of
the program djvused. Nothing is done if the DjVu file was not modified. Passing option -s to
program djvused is equivalent to executing command save before exiting the program.

save-bundled filename

Save the current DjVu document as a bundled multi-page DjVu document named filename. A
similar capability is offered by program djvmcvt.

save-indirect filename

Save the current DjVu document as an indirect multi-page DjVu document. The index file of the
indirect document will be named filename. All other files composing the indirect document will
be saved into the same directory as the index file. A similar capability is offered by program
djvmcvt.

save-page filename 

Save the selected component file into DjVu file filename. The selected component file might
import data from another component file using the so-called inclusion (INCL) chunks. This
command then produces a file with unresolved references to imported data. Such a file should then be
made part of a multi-page document containing the required data in other component files.

save-page-with filename

Save the selected component file into DjVu file filename. All data imported from other component
files is copied into the output file as well. This command always produces a usable DjVu file. On 
the other hand, collecting several such files into a multi-page document might lead to useless data 
replication. 

Miscellaneous commands 

help Display a help message listing all commands supported by djvused. 

dump Display the EA IFF 85 structure of the document or of the selected component file. A similar
capability is offered by program djvudump.

size Display the width and the height of the selected pages. The dimensions of each page are displayed
using a syntax suitable for direct insertion into the <EMBED...></EMBED> tags.

set-page-title title

Set a page title for the selected page. When page titles are available, recent versions of the
DjVuLibre viewers display these page titles instead of page numbers and also accept them in page
selection options. Command ls can be used to see both the page titles and page identifiers. To
unset a page title, simply make it equal to the page identifier.

DJVUSED FILE FORMATS 

Djvused uses a simple parenthesized syntax to represent both annotations and hidden text. 

* This syntax is the native syntax used by DjVu for storing annotations. Program djvused simply
compresses the annotation data using the bzz(1) algorithm.

* This syntax differs from the native syntax used by DjVu for storing the hidden text. Program djvused
performs the translations between the compact binary representation used by DjVu and the easily
modifiable parenthesized syntax.

General syntax 

Djvused files are ASCII text files. The legal characters in djvused files are the printable ASCII characters 
and the space, tab, cr, and nl characters. Using other characters has undefined results. 

Djvused files are composed of a sequence of expressions separated by blank characters (space, tab, cr, or 
nl). There are four kinds of expressions, namely integers, symbols, strings and lists.

Integers: 

Integer numbers are represented by one or more digits, with the usual interpretation. 

Symbols: 

Symbols, or identifiers, are sequences of printable ASCII characters representing a name or a
keyword. Acceptable characters are the alpha-numeric characters, the underscore "_", the minus
character "-", and the hash character "#". Names should not begin with a digit or a minus character.

Strings: 

Strings denote an arbitrary sequence of bytes, usually interpreted as a sequence of UTF-8 encoded
characters. Strings in djvused files are similar to strings in the C language. They are surrounded
by double quote characters. Certain sequences of characters starting with a backslash ("\") have a
special meaning. A backslash followed by letter "a", "b", "t", "n", "v", "f", "r", by a backslash, or
by a double quote stands for the ASCII character BEL(007), BS(010), HT(011), LF(012), VT(013),
FF(014), CR(015), BACKSLASH(134), and DOUBLEQUOTE(042) respectively, where the codes in
parentheses are octal. A backslash followed by one to three digits stands for the byte whose octal
code is expressed by the digits. All other backslash sequences are illegal. All non-printable ASCII
characters must be escaped.

Lists:

Lists are sequences of expressions separated by blanks and surrounded by parentheses. All
expression types are acceptable within a list, including sub-lists.
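
The string escaping rules above can be illustrated with a small decoder. The following Python sketch is not the djvused implementation; it merely applies the rules as stated in this section, and the function name decode_djvused_string is hypothetical.

```python
# Single-character escapes and the byte values they stand for.
SIMPLE = {"a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
          "\\": 0x5C, '"': 0x22}
OCTAL = "01234567"

def decode_djvused_string(literal):
    """Decode a double-quoted djvused string into a Python str (UTF-8 assumed)."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("string must be surrounded by double quotes")
    body = literal[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.extend(ch.encode("utf-8"))
            i += 1
            continue
        i += 1
        if i >= len(body):
            raise ValueError("dangling backslash")
        if body[i] in SIMPLE:
            out.append(SIMPLE[body[i]])
            i += 1
        elif body[i] in OCTAL:
            j = i
            # One to three octal digits.
            while j < len(body) and j < i + 3 and body[j] in OCTAL:
                j += 1
            value = int(body[i:j], 8)
            if value > 255:
                raise ValueError("octal escape does not fit in a byte")
            out.append(value)
            i = j
        else:
            raise ValueError("illegal backslash sequence")
    return out.decode("utf-8")

print(decode_djvused_string(r'"caf\303\251 \"x\"\n"'))
```

The two octal escapes in the sample are the UTF-8 bytes of a single accented character, which is why decoding to bytes first and to text afterwards matters.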

Hidden text syntax 

The building blocks of the hidden text syntax are lists representing each structural component of the hidden 
text. Structural components have the following form: 

(type xmin ymin xmax ymax ...) 

The symbol type must be one of page, column, region, para, line, word, or char, listed here by decreasing
order of importance. The integers xmin, ymin, xmax, and ymax represent the coordinates of a rectangle
indicating the position of the structural component in the page. Coordinates are measured in pixels and
have their origin at the bottom left corner of the page. The remaining expressions in the list are either a
single string representing the encoded text associated with this structural component, or a sequence of
structural components with a lesser type.

The hidden text for each page is simply represented by a single structural element of type page. Various
levels of structural information are acceptable. For instance, the page level component might only specify a
page level string, or might only provide a list of lines, or might provide a full hierarchy down to the
individual characters.
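
To make this structure concrete, the following Python sketch parses the parenthesized syntax into nested lists and extracts the words with their rectangles. It is a simplified illustration rather than the parser used by djvused: strings are kept undecoded, and the sample page and the function names parse and words are hypothetical.

```python
import re

# One token: "(", ")", a double-quoted string, or a bare integer/symbol.
TOKEN = re.compile(r'\s*(?:(\()|(\))|("(?:[^"\\]|\\.)*")|([^\s()"]+))')

def parse(text):
    """Parse djvused expressions into nested Python lists."""
    text = text.rstrip()
    stack = [[]]
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("syntax error at offset %d" % pos)
        pos = match.end()
        lpar, rpar, string, atom = match.groups()
        if lpar:
            stack.append([])
        elif rpar:
            if len(stack) == 1:
                raise ValueError("unbalanced closing parenthesis")
            done = stack.pop()
            stack[-1].append(done)
        elif string is not None:
            stack[-1].append(string[1:-1])  # escapes are left undecoded
        else:
            is_int = re.fullmatch(r"-?\d+", atom)
            stack[-1].append(int(atom) if is_int else atom)
    if len(stack) != 1:
        raise ValueError("unbalanced opening parenthesis")
    return stack[0]

def words(node):
    """Yield (text, xmin, ymin, xmax, ymax) for every word component."""
    kind, xmin, ymin, xmax, ymax, *rest = node
    if kind == "word" and rest and isinstance(rest[0], str):
        yield rest[0], xmin, ymin, xmax, ymax
    for child in rest:
        if isinstance(child, list):
            yield from words(child)

sample = '''(page 0 0 2550 3300
 (line 300 3000 900 3060
  (word 300 3000 520 3060 "Hello")
  (word 560 3000 900 3060 "world")))'''

page = parse(sample)[0]
print(list(words(page)))
```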

Outline/Bookmark syntax 

The outline syntax is a single list of the form 

(bookmarks...) 

The first element of the list is symbol bookmarks. The subsequent elements are lists representing the 
toplevel outline entries. Each outline entry is represented by a list with the following form: 

(title url ...)

The string title is the title of the outline entry. The destination string url can be either an arbitrary percent 
encoded URL, or composed of the hash character ("#") followed by a page name or number, or composed of 
the question mark character ("?") followed by cgi-style arguments interpreted by the djvu viewer. The 
remaining expressions in the list describe subentries of this outline entry. 
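
An outline in this syntax can be generated from a nested table of contents. The Python sketch below is only an illustration: the helper names quote, entry, and outline are hypothetical, strings are escaped with octal sequences for non-ASCII bytes, and the destination strings are assumed to be already percent encoded where needed.

```python
def quote(text):
    """Return text as a double-quoted djvused string with octal escapes."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in '\\"':
            out.append("\\" + ch)
        elif 32 <= byte < 127:
            out.append(ch)
        else:
            out.append("\\%03o" % byte)
    return '"' + "".join(out) + '"'

def entry(title, url, children=()):
    """Format one outline entry followed by its subentries."""
    parts = [quote(title), quote(url)] + [entry(*child) for child in children]
    return "(" + " ".join(parts) + ")"

def outline(entries):
    """Format the complete outline list starting with symbol bookmarks."""
    return "(" + " ".join(["bookmarks"] + [entry(*e) for e in entries]) + ")"

toc = [("Preface", "#1"),
       ("Chapter 1", "#5", [("Section 1.1", "#7")])]
print(outline(toc))
```

The resulting text could be stored in a file and passed to command set-outline.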

Annotation syntax 

Annotations are represented by a sequence of annotation expressions. The following annotation expres¬ 
sions are recognized: 

(background color) 

Specify the color of the viewer area surrounding the DjVu image. Colors are represented with the
X11 hexadecimal syntax #RRGGBB. For instance, #000000 is black and #FFFFFF is white.

(zoom zoomvalue) 

Specify the initial zoom factor of the image. Argument zoomvalue can be one of stretch,
one2one, width, page, or composed of the letter d followed by a number in range 1 to 999
representing a zoom factor (such as d300 or d150, for instance).

(mode modevalue ) 

Specify the initial display mode of the image. Argument modevalue is one of color, bw, fore, or 
back. 

(align horzalign vertalign ) 

Specify how the image should be aligned on the viewer surface. By default the image is located in
the center. Argument horzalign can be one of left, center, or right. Argument vertalign can be
one of top, center, or bottom.

(maparea url comment area ...) 

Define a hyper-link for the specified destination.

Argument url can have one of the following forms: 

href 

(url href target) 

where href is a string representing the destination and target is a string representing the target
frame for the hyper-link, as defined by the HTML anchor tag <A>. The destination string href can
be either an arbitrary percent encoded URL, or composed of the hash character ("#") followed by a
page name or number, or composed of the question mark character ("?") followed by cgi-style
arguments interpreted by the djvu viewer. Page numbers may be prefixed with an optional sign to
represent a page displacement. For instance the strings "#-1" and "#+1" can be used to access the
previous page and the next page.

Argument comment is a string that might be displayed by the viewer when the user moves the
mouse over the hyper-link.

Argument area defines the shape and the location of the hyperlink. The following forms are
recognized:

(rect xmin ymin width height)

(oval xmin ymin width height)

(poly x0 y0 x1 y1 ...)

(text xmin ymin width height)

(line x0 y0 x1 y1)

All parameters are numbers representing coordinates. Coordinates are measured in pixels and
have their origin at the bottom left corner of the page.

The remaining expressions in the maparea list represent the visual effect associated with the 
hyper-link. 

A first set of options defines how borders are drawn for rect, oval, polygon, or text hyperlink 
areas. 

(none) 

(xor) 

(border color) 

(shadow_in [thickness]) 

(shadow_out [ thickness ]) 

(shadow_ein [thickness]) 

(shadow_eout [ thickness ]) 

where parameter color has syntax #RRGGBB as described above, and parameter thickness is an 
integer in range 1 to 32. The last four border options are only supported for rect hyperlink areas. 
The default border is a simple black line. Border options do not apply to line areas. 

When a border option is specified, the border becomes visible when the user moves the mouse 
over the hyperlink. The border may be made always visible by using the following option: 

(border_avis) 

The following two options may be used with rect hyperlink areas. The complete area will be 
highlighted using the specified color at the specified opacity (0-100, default 50). 

(hilite color) 

(opacity op) 

This is often used with an empty URL for simply emphasizing a specific segment of an image. 

The following three options may be used with line areas to specify an optional ending arrow, the 
line width and color. The default is a black line with width 1 and without arrow. 

(arrow) 

(width w) 

(lineclr color) 

Finally the following three options can be used with text areas. The default background color is
transparent. The default text color is black. The pushpin option indicates that the text is
symbolized by a small pushpin icon. Clicking the icon reveals the text.

(backclr bkcolor) 

(textclr txtcolor) 

(pushpin) 

(metadata ... (key value) ...)

Define meta-data entries. Each entry is identified by a symbol key representing the nature of the
meta-data entry. The string value represents the value associated with the corresponding key. Two
sets of keys are noteworthy: keys borrowed from the BibTeX bibliography system, and keys
borrowed from the PDF DocInfo metadata. BibTeX keys are always expressed in lowercase, such
as year, booktitle, editor, author, etc. DocInfo keys start with an uppercase letter, such as Title,
Author, Subject, Creator, Producer, Trapped, CreationDate, and ModDate. The values
associated with the last two keys should be dates expressed according to RFC 3339.

LIMITATIONS 

The current version of program djvused only supports selecting one component file or all component files. 
There is no way to select only a few component files. 

CREDITS 

This program was initially written by Leon Bottou <leonb@users.sourceforge.net> and was improved by
Yann Le Cun <profshadoko@users.sourceforge.net>, Florin Nicsa, Bill Riemers
<docbill@sourceforge.net> and many others.

SEE ALSO 

djvu(1), djvutxt(1), djvmcvt(1), djvudump(1), bzz(1), Emacs djvused front end djvu.el on the GNU ELPA
repository.

NAME

djvuserve - Generate indirect DjVu documents on the fly. 

DESCRIPTION 

Program djvuserve is a CGI program that can be executed by an HTTP server for serving DjVu documents.
This program is able to convert a bundled multi-page document into an indirect document on the fly. 

USING DJVUSERVE 

Program djvuserve must first be installed as a CGI program for your web server. There are several ways to 
achieve this. The Apache web server, for instance, often defines a specific directory for CGI programs 
using the ScriptAlias directive. Assume that the file httpd.conf contains the following line: 

ScriptAlias /cgi-bin/ "/var/www/cgi-bin"

It is then sufficient to create a small executable shell script /var/www/cgi-bin/djvuserve containing the
following lines:

#!/bin/sh
exec /full/path/to/djvuserve

Suppose that a large bundled multi-page DjVu document is available at the following URL.

http://server/dir/doc.djvu

The CGI program djvuserve lets you access this same document as an indirect multi-page DjVu document 
using the following URL. 

http://server/cgi-bin/djvuserve/dir/doc.djvu/index.djvu

Serving indirect multi-page DjVu documents provides for efficiently browsing large documents without
transferring unnecessary pages over the network. See djvu(1) for more information.

Furthermore djvuserve searches for certain keywords among the CGI arguments of the URL. The keyword
bundled forces serving a bundled document using

http://server/cgi-bin/djvuserve/dir/doc.djvu?bundled

The keyword download inserts a content disposition HTTP header that suggests displaying a save dialog
instead of displaying the document.

http://server/cgi-bin/djvuserve/dir/doc.djvu?download

USING DJVUSERVE AS A HANDLER 

The Apache web server provides a way to automatically execute djvuserve for all DjVu documents. This 
can be achieved using the following directives in either the Apache configuration file or the .htaccess files. 

Action djvu-server /cgi-bin/djvuserve/ 

AddHandler djvu-server .djvu 

Apache then executes program djvuserve for serving all DjVu files. Requesting the URL of a DjVu file
serves this DjVu file as usual, except that bundled multi-page documents are converted to indirect
documents on the fly. This convenience comes at the expense of the computational cost of executing
djvuserve whenever a DjVu file is requested.

TECHNICAL DETAILS 

Program djvuserve provides a means to directly access any component of a bundled multi-page DjVu
document using an extended URL. Suppose that the component file representing page 1 is
named p0001.djvu. The following URL provides direct access to this page:

http://server/cgi-bin/djvuserve/dir/doc.djvu/p0001.djvu

It is preferred however to access individual pages using the CGI style arguments described in nsdejavu(1),
as in the following URL.

http://server/cgi-bin/djvuserve/dir/doc.djvu?djvuopts&page=12

The special component file name index.djvu is recognized as a request for the index of the corresponding
indirect multi-page document. In fact, when you access a bundled document using djvuserve, the browser
gets redirected to the following URL:

http://server/cgi-bin/djvuserve/dir/doc.djvu/index.djvu

and then behaves as if the bundled file was a directory containing the various component files of an
equivalent indirect document.

ACCESS CONTROL 

Program djvuserve, like many CGI programs, bypasses a number of access protections established in a web 
server. Assume for instance that your web site contains DjVu files protected by a password. Program 
djvuserve knows nothing about this protection and will happily serve any DjVu file associated with a valid 
URL. 

Access control with djvuserve can be implemented by first remembering that the web server always
executes program djvuserve via shell script /var/www/cgi-bin/djvuserve.

This script can decide to execute the real program djvuserve on the basis of the target filename available in 
the environment variable PATH_TRANSLATED. 

There can be several such scripts providing access to various collections of DjVu files. Each of these 
scripts can be password protected using the usual methods supported by your web server. 
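
The decision logic of such a wrapper can be sketched as follows. This Python version is only an illustration of the idea, not a hardened access control mechanism. The directory ALLOWED_ROOT is a hypothetical location for the DjVu files that this wrapper may serve, and REAL_DJVUSERVE stands for the actual location of program djvuserve.

```python
import os
import sys

ALLOWED_ROOT = "/var/www/html/public-djvu"   # hypothetical collection directory
REAL_DJVUSERVE = "/full/path/to/djvuserve"   # location of the real program

def is_allowed(path_translated, root=ALLOWED_ROOT):
    """Tell whether the requested file lies inside the permitted directory."""
    if not path_translated:
        return False
    root = os.path.realpath(root)
    target = os.path.realpath(path_translated)
    # Resolving both paths first defeats ".." components and symbolic links.
    return os.path.commonpath([root, target]) == root

def main():
    """Execute djvuserve for permitted files and refuse everything else."""
    if is_allowed(os.environ.get("PATH_TRANSLATED", "")):
        os.execv(REAL_DJVUSERVE, [REAL_DJVUSERVE] + sys.argv[1:])
    sys.stdout.write("Status: 403 Forbidden\r\n"
                     "Content-Type: text/plain\r\n\r\nForbidden\n")
```

A script installed in the CGI directory would end with a call to main(). The inherited environment is passed unchanged to the real program by os.execv.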

KNOWN BUGS 

Hyperlinks specified using a relative URL may not work with djvuserve. These URLs are relative to the
URL of the DjVu document. Yet djvuserve changes the apparent document URL http://server/dir/doc.djvu
into the more complicated URL http://server/cgi-bin/djvuserve/dir/doc.djvu/index.djvu. The extra
components change the interpretation of relative URLs.
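
The effect can be reproduced with the standard URL resolution rules. The following Python sketch resolves the same relative hyperlink against both document URLs; the relative link figures/plate1.djvu is a made-up example.

```python
from urllib.parse import urljoin

direct = "http://server/dir/doc.djvu"
served = "http://server/cgi-bin/djvuserve/dir/doc.djvu/index.djvu"
link = "figures/plate1.djvu"   # hypothetical relative hyperlink

# Resolved against the original document URL.
print(urljoin(direct, link))
# Resolved against the URL produced by the djvuserve redirection.
print(urljoin(served, link))
```

The first result points into directory dir as the author intended, whereas the second one treats doc.djvu as a directory below the CGI program.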

CREDITS 

This program was written by Leon Bottou <leonb@users.sourceforge.com>. 

SEE ALSO 

djvu(1), djvmcvt(1), nsdejavu(1)

NAME

djvutxt - Extract the hidden text from DjVu documents. 

SYNOPSIS 

djvutxt [options] inputdjvufile [outputtxtfile] 


DESCRIPTION 

Program djvutxt decodes the hidden text layer of a DjVu document inputdjvufile and prints it into file
outputtxtfile or on the standard output. The hidden text layer is usually generated with the help of
optical character recognition software.

Without options -detail and -escape, this program simply outputs the UTF-8 text. Option -detail causes the
output of S-expressions describing the text and its location. Option -escape uses C-style escape sequences
to represent non-printable or non-ASCII characters.


OPTIONS 

--page=pagespec

Specify which pages should be processed. When this option is not specified, the text of all pages
of the document is concatenated into the output file. The page specification pagespec contains
one or more comma-separated page ranges. A page range is either a page number, or two page
numbers separated by a dash. For instance, specification 1-10 outputs pages 1 to 10, and
specification 1,3,99999-4 outputs pages 1 and 3, followed by all the document pages in reverse order up to
page 4.

--detail=keyword

This option causes djvutxt to output S-expressions specifying the position of the text in the page.
See the manual page djvused(1) for a description of the output format. Argument keyword
specifies the maximum level of detail for which text location is reported. The recognized values are:
page, column, region, para, line, word, and char. All other values are interpreted as char.

--escape

Output escape sequences of the form "\ooo" for all non-ASCII or non-printable UTF-8 characters
and for the backslash character.
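
The page specification described for the page option can be expanded as in the following Python sketch. This is a simplified reading of the description above, not the code used by djvutxt: it assumes that page numbers beyond the last page are clamped to the last page, which is consistent with the 99999-4 example, and the function name expand_pagespec is hypothetical.

```python
def expand_pagespec(pagespec, page_count):
    """Expand a comma-separated list of page ranges into a list of page numbers."""
    pages = []
    for part in pagespec.replace(" ", "").split(","):
        first, dash, last = part.partition("-")
        # Clamp both ends into the valid range 1..page_count (an assumption).
        start = min(max(int(first), 1), page_count)
        end = min(max(int(last), 1), page_count) if dash else start
        # A range whose end precedes its start is walked in reverse order.
        step = 1 if end >= start else -1
        pages.extend(range(start, end + step, step))
    return pages

print(expand_pagespec("1,3,99999-4", 6))
```

For a six page document the last range runs backwards from page 6 to page 4.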


REMARKS 

Use program djvused(1) for more control over the text layer.


CREDITS 

This program was initially written by Andrei Erofeev <andrew_erofeev@yahoo.com> and was then
improved by Bill Riemers <docbill@sourceforge.net> and many others. It was then rewritten to use the
ddjvuapi by Leon Bottou <leonb@sourceforge.net>.


SEE ALSO

djvu(1), djvused(1)

NAME

djvutoxml, djvuxmlparser - DjVuLibre XML Tools. 

SYNOPSIS 

djvutoxml [ options ] inputdjvufile [ outputxmlfile ] 
djvuxmlparser [ -o djvufile ] inputxmlfile 


DESCRIPTION 

The DjVuLibre XML Tools provide for editing the metadata, hyperlinks and hidden text associated with
DjVu files. Unlike djvused(1), the DjVuLibre XML Tools rely on the XML technology and can take
advantage of XML editors and verifiers.

DJVUTOXML 

Program djvutoxml creates an XML file outputxmlfile containing a reference to the original DjVu document
inputdjvufile as well as tags describing the metadata, hyperlinks, and hidden text associated with the DjVu 
file. 

The following options are supported: 

--page pagenum

Select a page in a multi-page document. Without this option, djvutoxml outputs the XML
corresponding to all pages of the document.

--with-text

Specifies that the HIDDENTEXT element for each page should be included in the output. If specified
without the --with-anno flag then the --without-anno flag is implied. If none of the --with-text,
--without-text, --with-anno, or --without-anno flags are specified, then the --with-text and
--with-anno flags are implied.

--without-text

Specifies that the HIDDENTEXT element for each page should not be included in the output. If
specified without the --without-anno flag then the --with-anno flag is implied.

--with-anno

Specifies that the area MAP element for each page should be included in the output. If specified
without the --with-text flag then the --without-text flag is implied.

--without-anno

Specifies that the area MAP element for each page should not be included in the output. If specified
without the --without-text flag then the --with-text flag is implied.


DJVUXMLPARSER 

Files produced by djvutoxml can then be modified using either a text editor or an XML editor. Program
djvuxmlparser parses the XML file inputxmlfile in order to modify the metadata of the corresponding 
DjVu file. 

-o djvufile 

In principle the target DjVu file is the file referenced by the OBJECT element of the XML file. 
This option provides the means to override the filename specified in the OBJECT element. 
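
A typical round trip is to export with djvutoxml, edit the XML, and apply it with djvuxmlparser. The 
following Python sketch only builds the two command lines from the SYNOPSIS; the helper names and the 
file names are hypothetical, and it assumes the long options are spelled with a double hyphen. 

```python
def djvutoxml_argv(input_djvu, output_xml=None, page=None, flags=()):
    """Build a djvutoxml command line following the SYNOPSIS above."""
    argv = ["djvutoxml"]
    if page is not None:
        # The page value is passed through unchanged.
        argv += ["--page", str(page)]
    argv += list(flags)          # e.g. ("--with-text",)
    argv.append(input_djvu)
    if output_xml is not None:
        argv.append(output_xml)
    return argv


def djvuxmlparser_argv(input_xml, target_djvu=None):
    """Build a djvuxmlparser command line; -o overrides the OBJECT filename."""
    argv = ["djvuxmlparser"]
    if target_djvu is not None:
        argv += ["-o", target_djvu]
    argv.append(input_xml)
    return argv


export_cmd = djvutoxml_argv("book.djvu", "book.xml", flags=("--with-text",))
import_cmd = djvuxmlparser_argv("book.xml", target_djvu="book-edited.djvu")
```

Either list can be handed to subprocess.run(argv, check=True) on a system where the tools are installed. 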

DJVUXML DOCUMENT TYPE DEFINITION 

The document type definition file (DTD) 

/usr/share/djvu/pubtext/DjVuXML-s.dtd 

defines the input and output of the DjVu XML tools. 

The DjVuXML-s DTD is a simplification of the HTML DTD: 

http://www.w3c.org/TR/1998/REC-html40-19980424/sgml/dtd.html 

with a few new attributes added specific to DjVu. Each of the specified pages of a DjVu document is 
represented as an OBJECT element within the BODY element of the XML file. Each OBJECT element may 
contain multiple PARAM elements to specify attributes such as page name, resolution, and gamma factor. 
Each OBJECT element may also contain one HIDDENTEXT element to specify the hidden text (usually 
generated with an OCR engine) within the DjVu page. In addition, each OBJECT element may reference 
a single area MAP element, which contains multiple AREA elements to represent all the hyperlink 
and highlight areas within the DjVu document. 

PARAM Elements 

Legal PARAM elements of a DjVu OBJECT include, but are not limited to, PAGE for specifying the page 
name, GAMMA for specifying the gamma correction factor (normally 2.2), and DPI for specifying the 
page resolution. 
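
As an illustration, the PARAM elements of one OBJECT can be collected into a dictionary with the Python 
standard library. The fragment below is hand-written, not djvutoxml output: it assumes PARAM carries the 
HTML 4 name and value attributes, and the page name and DPI value are made up. 

```python
import xml.etree.ElementTree as ET

# Hypothetical OBJECT fragment; real files carry more attributes and elements.
SAMPLE_OBJECT = """\
<OBJECT>
  <PARAM name="PAGE" value="p0001.djvu" />
  <PARAM name="GAMMA" value="2.2" />
  <PARAM name="DPI" value="300" />
</OBJECT>
"""


def read_params(object_xml):
    """Map each PARAM name to its value (both kept as strings)."""
    obj = ET.fromstring(object_xml)
    return {p.get("name"): p.get("value") for p in obj.iter("PARAM")}


params = read_params(SAMPLE_OBJECT)
gamma = float(params.get("GAMMA", "2.2"))  # fall back to the usual 2.2
dpi = int(params["DPI"])
```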

HIDDENTEXT Elements 

The HIDDENTEXT element consists of nested PAGECOLUMNS, REGION, PARAGRAPH, LINE, and WORD 
elements. The most deeply nested element specified should give the bounding 
coordinates of the element in top-down orientation. The body of the most deeply nested element should 
contain the text. Most DjVu documents use either LINE or WORD as the lowest level element, but any 
element is legal as the lowest level element. A white space is always added between WORD elements and 
a line feed is always added between LINE elements. Since languages such as Japanese do not use spaces 
between words, it is quite common for Asian OCR engines to use WORD elements for individual characters instead. 
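
The joining rule (a space between WORD elements, a line feed between LINE elements) can be sketched in 
Python as follows. The sample is hand-written: element names follow the text above, while the coords 
attribute name and its value order are assumptions, so consult the DTD for the exact syntax. 

```python
import xml.etree.ElementTree as ET

# Hypothetical HIDDENTEXT fragment; the "coords" attribute is an assumption.
SAMPLE_HIDDENTEXT = """\
<HIDDENTEXT>
  <PAGECOLUMNS>
    <REGION>
      <PARAGRAPH>
        <LINE>
          <WORD coords="100,200,180,170">Hello</WORD>
          <WORD coords="190,200,260,170">world</WORD>
        </LINE>
        <LINE>
          <WORD coords="100,240,210,210">again</WORD>
        </LINE>
      </PARAGRAPH>
    </REGION>
  </PAGECOLUMNS>
</HIDDENTEXT>
"""


def hidden_text_to_plain(xml_fragment):
    """Rebuild plain text: words joined by spaces, lines by line feeds."""
    root = ET.fromstring(xml_fragment)
    lines = []
    for line in root.iter("LINE"):
        words = [w.text.strip() for w in line.iter("WORD")
                 if w.text and w.text.strip()]
        if words:
            lines.append(" ".join(words))
        elif line.text and line.text.strip():
            # LINE is the lowest level element and holds the text itself.
            lines.append(line.text.strip())
    return "\n".join(lines)


plain = hidden_text_to_plain(SAMPLE_HIDDENTEXT)
```

The sketch handles only documents whose lowest level is LINE or WORD, which the text describes as the 
common case. 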

MAP Elements 

The body of a MAP element consists of AREA elements. In addition to the attributes listed in 

http://www.w3.org/TR/1998/REC-html40-19980424/struct/objects.html#edef-AREA, 

the attributes bordertype, bordercolor, border, and highlight have been added to specify border type, 
border color, border width, and highlight color, respectively. Legal values for each of these attributes are 
listed in the DjVuXML-s DTD. In addition, the shape oval has been added to the legal list of shapes. An 
oval uses a rectangular bounding box. 
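
To make the oval shape concrete, the Python sketch below derives the center and semi-axes of an oval 
from its bounding box. It assumes that an oval takes the same four comma-separated coords as an HTML 
rect (left, top, right, bottom); the sample MAP, its name, and its links are made up, and the DjVu-specific 
border and highlight attributes are left out because their legal values are defined only in the DTD. 

```python
import xml.etree.ElementTree as ET

# Hypothetical MAP fragment using only HTML 4 AREA attributes plus shape="oval".
SAMPLE_MAP = """\
<MAP name="page1">
  <AREA shape="rect" coords="10,20,110,60" href="#2" />
  <AREA shape="oval" coords="200,100,300,160" href="http://example.org/" />
</MAP>
"""


def oval_geometry(coords):
    """Return ((cx, cy), (rx, ry)) for an oval inscribed in its bounding box."""
    left, top, right, bottom = (int(v) for v in coords.split(","))
    center = ((left + right) / 2, (top + bottom) / 2)
    radii = (abs(right - left) / 2, abs(bottom - top) / 2)
    return center, radii


def describe_areas(map_xml):
    """List the AREA elements of a MAP, adding geometry for ovals."""
    root = ET.fromstring(map_xml)
    areas = []
    for area in root.iter("AREA"):
        shape = area.get("shape", "rect")  # HTML 4 default shape is rect
        entry = {"shape": shape, "href": area.get("href")}
        if shape == "oval":
            entry["center"], entry["radii"] = oval_geometry(area.get("coords"))
        areas.append(entry)
    return areas


areas = describe_areas(SAMPLE_MAP)
```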


BUGS 

Perhaps it would have been better to use CSS2 style sheets with standard HTML elements instead of 
defining the HIDDENTEXT element. 

CREDITS 

The DjVu XML tools and DTD were written by Bill C. Riemers <docbill@sourceforge.net> and Fred 
Crary. 


SEE ALSO 

djvu(1), djvused(1), and utf8(7). 